
Regular expression for a specific pair or each item in a pair

I have a situation where I might be getting one or both of a pair of characters and I want to match either.

For example:

str = 'cddd a dfsdf b sdfg ab uyeroi'

I want to match any "a", "b", or "ab". If "ab" appears together, I want to catch it as a single match (not as two matches, "a" and "b"). When both appear they will always be in that order: "a" always precedes "b".

What I have is:

/[ab]|ab/

But I'm not sure whether the ab alternative will take precedence over the [ab] alternative.

Thanks for the assistance.

James Fassett asked Dec 05 '25 07:12


1 Answer

Your current expression will not do what you want in most popular regular expression engines: given "ab", it will match "a" and then "b" as two separate matches, never "ab" as a single match. The behaviour depends on the implementation of the regex engine:

You can easily find out whether the regex flavor you intend to use has a text-directed or regex-directed engine. If backreferences and/or lazy quantifiers are available, you can be certain the engine is regex-directed. You can do the test by applying the regex /regex|regex not/ to the string "regex not". If the resulting match is only "regex", the engine is regex-directed. If the result is "regex not", then it is text-directed. The reason behind this is that the regex-directed engine is "eager".
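As a rough illustration, the test described above can be run in Python, whose standard re module is a regex-directed (backtracking) engine. The question does not say which language is in use, so Python and the variable names here are assumptions made only for the sake of the sketch.

```python
import re

# Engine probe from the quoted test: a regex-directed engine accepts the
# first alternative that succeeds, so it reports "regex", not "regex not".
probe = re.search(r"regex|regex not", "regex not").group()
print(probe)  # regex

# The same eagerness affects the pattern from the question: [ab] is tried
# first and succeeds on "a", so "ab" is never matched as a single unit.
text = "cddd a dfsdf b sdfg ab uyeroi"
original = re.findall(r"[ab]|ab", text)
print(original)  # ['a', 'b', 'a', 'b']
```

The trailing "ab" in the sample string comes back as two separate matches, which is exactly the behaviour the question wants to avoid.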

If you are using a regex-directed engine then to fix it you could reverse the order of the terms in the alternation to ensure it attempts to match ab first:

/ab|[ab]/

Or you could rewrite the expression so that the order doesn't matter:

/ab?|b/
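
Both rewrites can be checked against the sample string from the question. This is a minimal sketch in Python (an assumed language, since the question does not name one); the variable names are illustrative only.

```python
import re

text = "cddd a dfsdf b sdfg ab uyeroi"

# Longer alternative first: "ab" is attempted before the single characters.
reordered = re.findall(r"ab|[ab]", text)
print(reordered)  # ['a', 'b', 'ab']

# Order-independent form: "a" optionally followed by "b", or a lone "b".
order_free = re.findall(r"ab?|b", text)
print(order_free)  # ['a', 'b', 'ab']
```

The second form works in any order because its two alternatives cannot match at the same position: one must start with "a", the other with "b".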

Mark Byers answered Dec 07 '25 22:12


