Noob regex poser (match MAY contain and MUST have)

Tags: regex

This is probably really simple for you regex masters :) I'm a noob at regex, having only just picked up some PHP, but I want to learn (once this project is complete, I'll knuckle down and crack regular expressions). I'd like to understand how to compose a regex in which some parts of the match are optional but others are required.

For example, the match MAY begin with a number, but doesn't have to. If it does, I need the number and the following 2 words; if it doesn't, I need just the first 2 words. The data will always be at the beginning of the string.

The following would match:

  • 123 Fore Street, Fiveways (returns "123 Fore Street", without the comma)
  • Our House Village (returns "Our House")
  • 7 Eightnine (returns "7 Eightnine")

Thanks

MrG asked Oct 25 '25 22:10


2 Answers

Something like this should work:

^((?:\d+\s)?\w+(?:\s\w+)?)

You can test it somewhere like http://rubular.com/ before coding it; that's usually easier.

What it means:

^ -> the beginning of the string (or of each line, in multiline mode)

(?:\d+\s)? -> a non-capturing group (marked by ?:) consisting of one or more digits and a single whitespace character. Because it is followed by ?, the whole group is optional.

\w+(?:\s\w+)? -> one or more word characters (\w matches letters, digits and the underscore), optionally followed by a whitespace character and another "word", again in a non-capturing group.

The whole thing is encapsulated in a capturing group, so group 1 will contain your match.
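
As a quick check of the explanation above, here is a minimal sketch that runs the pattern against the three sample strings from the question. It uses Python's re module purely for illustration (the question mentions PHP), and the helper name leading_part is made up for this example.

```python
import re

# Pattern from the answer: optional leading number, then one or two "words".
PATTERN = re.compile(r"^((?:\d+\s)?\w+(?:\s\w+)?)")


def leading_part(text):
    """Return group 1 (number plus up to two words), or None if nothing matches."""
    match = PATTERN.match(text)
    return match.group(1) if match else None


samples = [
    "123 Fore Street, Fiveways",
    "Our House Village",
    "7 Eightnine",
]
for sample in samples:
    print(repr(sample), "->", repr(leading_part(sample)))
# '123 Fore Street, Fiveways' -> '123 Fore Street'
# 'Our House Village' -> 'Our House'
# '7 Eightnine' -> '7 Eightnine'
```

One thing to watch: because \w also matches digits, a string such as "12 34 56 Street" yields "12 34 56". The same pattern should also work with PHP's preg_match, with the result in index 1 of the matches array.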

pcalcao answered Oct 28 '25 16:10


Use this regex with the multiline option (only needed if the input holds several lines and you want a match at the start of each one):

^(\d+(\s*\b[a-zA-Z]+\b){1,2}|(\s*\b[a-zA-Z]+\b){1,2})

Group 1 contains the data you need.

\d+ matches a digit (\d) one or more times (+)

\s* matches a whitespace character (\s) zero or more times (*)

(\s*\b[a-zA-Z]+\b){1,2} matches 1 to 2 words made of letters only, each optionally preceded by whitespace.
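
To see what the multiline option changes, here is a small sketch that applies the pattern to several lines at once. Python's re.MULTILINE stands in for the multiline flag, and the helper name leading_parts and the sample text are assumptions for illustration only.

```python
import re

# Pattern from the answer, compiled so that ^ matches at the start of every line.
PATTERN = re.compile(
    r"^(\d+(\s*\b[a-zA-Z]+\b){1,2}|(\s*\b[a-zA-Z]+\b){1,2})",
    re.MULTILINE,
)


def leading_parts(text):
    """Return group 1 of every match found at the start of a line."""
    return [match.group(1) for match in PATTERN.finditer(text)]


addresses = "123 Fore Street, Fiveways\nOur House Village\n7 Eightnine"
print(leading_parts(addresses))
# ['123 Fore Street', 'Our House', '7 Eightnine']
```

Note that \s also matches a newline, so a line with only one word can pull in the first word of the next line: "7 Eightnine\nOur House" gives a single match, "7 Eightnine\nOur". If that matters for your data, replacing \s* with [ \t]* is one way to keep each match on its own line.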

Anirudha answered Oct 28 '25 15:10