I need to extract the house number from addresses in all the different formats used in Austria:
| Street name | housenumber | stairs | floor | door |
| --------------------------------------- | ----------- | ------ | ----- | ---- |
| Lilienstr. 12a | 12a | | | |
| Leibnizstraße 36/28/2 | 36 | 28 | | 2 |
| Prager Straße 14/3/1/4 | 14 | 3 | 1 | 4 |
| Guentherstr. 43 B | 43 B | | | |
| Eberhard-Leibnitz Str. 1/7 | 1 | | | 7 |
| Schießstätte 7/7 | 7 | | | 7 |
I've already found this question: Regex to extract (german) street number.
The pattern from that question (shown below) works only if no stairs/floor/door is entered. Can you help?
^[ \-0-9a-zA-ZäöüÄÖÜß.]+?\s+(\d+(\s?[a-zA-Z])?)\s*(?:$|\(|[A-Z]{2})
Credit for the core of the pattern, which uses optional capturing groups with a positive lookahead, goes to @JvdV, who suggested it in the comments.
As an alternative, you can get the group numbers / names in the order of the table columns in the question by capturing the digits of the stairs / floor / door and asserting how many parts consisting of a forward slash followed by digits are directly to the right.
If an assertion fails, the pattern will try the next part, as all the groups are optional.
^(?<address>(?<streetname>\h*\S.*?)\h*(?<housenumber>\d+\h*[A-Za-z]?))(?:/(?<stairs>\d+)(?=(?:/\d+){1,2}))?(?:/(?<floor>\d+)(?=(?:/\d+)))?(?:/(?<door>\d+))?$
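To see how the lookaheads alone decide which slot each number lands in, here is a minimal sketch that applies only the slash-separated tail of the pattern. The function name `splitSuffix` is hypothetical and not part of the original answer, and the floor group is written as `\d+` to match the breakdown of the pattern.

```php
<?php
// Hypothetical helper (not from the original answer): runs only the
// stairs/floor/door part of the pattern against a suffix such as "/28/2".
function splitSuffix(string $suffix): array
{
    $re = '~^(?:/(?<stairs>\d+)(?=(?:/\d+){1,2}))?'  // stairs: needs 1 or 2 more "/digits" parts after it
        . '(?:/(?<floor>\d+)(?=/\d+))?'              // floor: needs 1 more "/digits" part after it
        . '(?:/(?<door>\d+))?$~';                    // door: always the last part

    if (!preg_match($re, $suffix, $m)) {
        return [];
    }

    // Keep only the named groups (string keys) ...
    $named = array_filter($m, 'is_string', ARRAY_FILTER_USE_KEY);

    // ... and drop the groups that did not take part in the match.
    return array_filter($named, function ($value) {
        return $value !== '';
    });
}

print_r(splitSuffix('/7'));      // door only
print_r(splitSuffix('/28/2'));   // stairs and door
print_r(splitSuffix('/3/1/4'));  // stairs, floor and door
```

With a single part, both lookaheads fail and the number falls through to `door`; with two parts, only the `stairs` lookahead succeeds, so the second number again ends up in `door`.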
In parts
^ Start of string
(?<address> Group address
  (?<streetname> Group streetname
    \h*\S.*? Match 0+ horizontal whitespace chars, a non-whitespace char to make sure the street name is not empty, then any char as few times as possible (non-greedy)
  ) Close group streetname
  \h* Match 0+ horizontal whitespace chars (the spaces after the street name)
  (?<housenumber> Group housenumber
    \d+\h*[A-Za-z]? Match 1+ digits, 0+ horizontal whitespace chars and an optional char a-zA-Z
  ) Close group housenumber
) Close group address
(?: Non capture group
  /(?<stairs>\d+) Match / and capture 1+ digits in group stairs
  (?=(?:/\d+){1,2}) Positive lookahead, assert that what is directly to the right is 1 or 2 repetitions of / followed by 1+ digits
)? Close group and make it optional
(?: Non capture group
  /(?<floor>\d+) Match / and capture 1+ digits in group floor
  (?=(?:/\d+)) Positive lookahead, assert that what is directly to the right is / followed by 1+ digits
)? Close group and make it optional
(?: Non capture group
  /(?<door>\d+) Match / and capture 1+ digits in group door
)? Close group and make it optional
$ End of string
Example code
$re = '~^(?<address>(?<streetname>\h*\S.*?)\h*(?<housenumber>\d+\h*[A-Za-z]?))(?:/(?<stairs>\d+)(?=(?:/\d+){1,2}))?(?:/(?<floor>\d+)(?=(?:/\d+)))?(?:/(?<door>\d+))?$~m';
$strings = [
    "Lilienstr. 12a",
    "Leibnizstraße 36/28/2",
    "Prager Straße 14/3/1/4",
    "Guentherstr. 43 B",
    "Eberhard-Leibnitz Str. 1/7",
    "Schießstätte 7/7"
];

foreach ($strings as $string) {
    preg_match_all($re, $string, $matches, PREG_SET_ORDER);
    // Keep only the named groups (string keys) of the first match
    $address = array_filter($matches[0], "is_string", ARRAY_FILTER_USE_KEY); // from php 5.6
    print_r($address);
}
Output
Array
(
[address] => Lilienstr. 12a
[streetname] => Lilienstr.
[housenumber] => 12a
)
Array
(
[address] => Leibnizstraße 36
[streetname] => Leibnizstraße
[housenumber] => 36
[stairs] => 28
[floor] =>
[door] => 2
)
Array
(
[address] => Prager Straße 14
[streetname] => Prager Straße
[housenumber] => 14
[stairs] => 3
[floor] => 1
[door] => 4
)
Array
(
[address] => Guentherstr. 43 B
[streetname] => Guentherstr.
[housenumber] => 43 B
)
Array
(
[address] => Eberhard-Leibnitz Str. 1
[streetname] => Eberhard-Leibnitz Str.
[housenumber] => 1
[stairs] =>
[floor] =>
[door] => 7
)
Array
(
[address] => Schießstätte 7
[streetname] => Schießstätte
[housenumber] => 7
[stairs] =>
[floor] =>
[door] => 7
)
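As the output shows, groups that did not take part in the match are either missing or empty strings. If a fixed shape is more convenient, the pattern could be wrapped roughly as follows. This is only a sketch: the function name `parseAustrianAddress`, the `trim()` call, the `u` modifier (which assumes UTF-8 input), and the use of `null` for missing parts are assumptions, not part of the original answer. The `address` group is left out for brevity.

```php
<?php
// Hypothetical wrapper (not from the original answer): always returns the
// same five keys, with null for parts that are not present in the input.
function parseAustrianAddress(string $input): ?array
{
    // Same pattern as above without the outer address group.
    // The u modifier is an assumption: it treats the input as UTF-8.
    $re = '~^(?<streetname>\h*\S.*?)\h*(?<housenumber>\d+\h*[A-Za-z]?)'
        . '(?:/(?<stairs>\d+)(?=(?:/\d+){1,2}))?'
        . '(?:/(?<floor>\d+)(?=/\d+))?'
        . '(?:/(?<door>\d+))?$~u';

    // Returns null when there is no match (or the input is not valid UTF-8).
    if (!preg_match($re, trim($input), $m)) {
        return null;
    }

    $result = [];
    foreach (['streetname', 'housenumber', 'stairs', 'floor', 'door'] as $key) {
        // Unmatched groups are absent or empty strings; normalise both to null.
        $result[$key] = (isset($m[$key]) && $m[$key] !== '') ? $m[$key] : null;
    }

    return $result;
}

print_r(parseAustrianAddress('Prager Straße 14/3/1/4'));
var_dump(parseAustrianAddress('Eberhard-Leibnitz Str. 1/7'));
```

Returning `null` for the whole result when nothing matches keeps "no address found" distinct from "address found, but without stairs, floor or door".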