
Match a pipe character in the middle of a string with a python regex

Tags: python, regex

I am trying to match a pipe character in a string using a Python regex and I can't seem to get it to match. I've boiled it down to a simplified version.

Let's say I am looking for the sequence z|a in a string. Here are some possible regexes and the results:

>>> import re
>>> re.match(r'|', 'xyz|abc')
<_sre.SRE_Match object at 0x2d9a850>
>>> re.match(r'z|', 'xyz|abc')
<_sre.SRE_Match object at 0x2d9a780>
>>> re.match(r'|a', 'xyz|abc')
<_sre.SRE_Match object at 0x2d9a850>
>>> re.match(r'z|a', 'xyz|abc')
>>> re.match(r'z\|a', 'xyz|abc')
>>> re.match(r'z\\|a', 'xyz|abc')
>>> re.match(r'z\\\|a', 'xyz|abc')
>>> re.match(r'z[|]a', 'xyz|abc')
>>> 

So I can match with |, |a and z| but I can't find a way to match z|a. Any ideas?

Hamish Downer asked Nov 28 '25 00:11



1 Answer

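re.match() only looks for a match at the start of the string, so your correctly escaped patterns r'z\|a' and r'z[|]a' fail simply because 'xyz|abc' does not begin with z. Use re.search() instead, which scans the whole string.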

The patterns you have that do match are matching the empty string: r'|' is empty string or empty string, r'z|' is z or empty string, and r'|a' is empty string or a. All of those will match any string.
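To see the empty-string matches directly, a small sketch like the following may help. It prints the span of each match from the question; the variable names text and spans are only for illustration.

```python
import re

text = 'xyz|abc'

# An unescaped | means alternation, and an empty alternative
# matches the empty string at the current position.
spans = []
for pattern in (r'|', r'z|', r'|a'):
    m = re.match(pattern, text)
    spans.append(m.span())
    print(repr(pattern), m.span(), repr(m.group()))

# r'z|a' means "z or a". Neither letter is at position 0,
# so re.match() finds nothing at all.
no_match = re.match(r'z|a', text)
print(no_match)
```

Each of the three matches has a span of (0, 0), i.e. zero characters consumed at the start of the string, which is why they appear to "work" without ever touching the pipe.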

>>> re.match('z\\|a', 'xyz|abc')
>>> re.search('z\\|a', 'xyz|abc')
<_sre.SRE_Match object at 0x02BF2BB8>
>>> re.search(r'z\|a', 'xyz|abc')
<_sre.SRE_Match object at 0x02BF2BF0>

More generally, you can use re.escape() on a literal string that you need to include in the middle of a more complex regular expression, so you don't have to work out how many backslashes are needed to escape things.
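As a rough illustration of the re.escape() approach, here is a minimal sketch. The surrounding \w parts of the larger pattern are an assumption made up for the example, not something from the question.

```python
import re

text = 'xyz|abc'
literal = 'z|a'                 # the text we want to find verbatim

# re.escape() backslash-escapes regex metacharacters such as the pipe.
escaped = re.escape(literal)
print(escaped)

# The escaped literal can be used on its own with re.search() ...
m = re.search(escaped, text)
print(m.group(), m.span())

# ... or embedded in a larger pattern (hypothetical: one word
# character required on each side of the literal).
bigger = re.search(r'\w' + escaped + r'\w', text)
print(bigger.group(), bigger.span())
```

Building the pattern by concatenation keeps the literal part and the regex part visibly separate, which tends to be easier to read than counting backslashes by hand.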

Duncan answered Nov 30 '25 14:11



