I'm having trouble matching the underscore character in Python using regular expressions. Just playing around in the shell, I get:
>>> import re
>>> re.match(r'a', 'abc')
<_sre.SRE_Match object at 0xb746a368>
>>> re.match(r'_', 'ab_c')
>>> re.match(r'[_]', 'ab_c')
>>> re.match(r'\_', 'ab_c')
I would have expected at least one of these to return a match object. Am I doing something wrong?
Use re.search instead of re.match if the pattern you are looking for is not at the start of the search string.
re.match(pattern, string, flags=0)
Try to apply the pattern at the start of the string, returning a match object, or None if no match was found.
re.search(pattern, string, flags=0)
Scan through string looking for a match to the pattern, returning a match object, or None if no match was found.
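As a minimal sketch of that difference, the snippet below runs both functions on the string from the question. The variable names are only illustrative.

```python
import re

text = 'ab_c'

# re.match only tries the pattern at position 0, where the character is 'a'.
at_start = re.match(r'_', text)

# re.search scans forward and finds the underscore at index 2.
anywhere = re.search(r'_', text)

print(at_start)         # None
print(anywhere.span())  # (2, 3)
```

The pattern is identical in both calls; only the position at which it is tried differs.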
You don't need to escape _ or even use a raw string.
>>> re.search('_', 'ab_c')
<_sre.SRE_Match object; span=(2, 3), match='_'>
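To check this against the three patterns tried in the question, here is a small sketch that passes each of them to re.search. The list name and loop are illustrative, not part of the original session.

```python
import re

text = 'ab_c'

# Three spellings of the same pattern: a bare underscore, a character
# class containing only an underscore, and a backslash-escaped underscore.
patterns = ['_', '[_]', r'\_']

spans = [re.search(p, text).span() for p in patterns]
print(spans)  # [(2, 3), (2, 3), (2, 3)]
```

All three find the underscore at the same position, so the original patterns were fine and only the choice of function was at fault.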
Try the following:
re.search(r'\_', 'ab_c')
Escaping the underscore is harmless but not required, since _ has no special meaning in a regular expression. Keep in mind that match only matches at the beginning of the string, as the documentation (https://docs.python.org/2/library/re.html) makes clear:
If zero or more characters at the beginning of string match the regular expression pattern, return a corresponding MatchObject instance. Return None if the string does not match the pattern; note that this is different from a zero-length match.
You should use search in this case:
Scan through string looking for the first location where the regular expression pattern produces a match, and return a corresponding MatchObject instance. Return None if no position in the string matches the pattern; note that this is different from finding a zero-length match at some point in the string.
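The remark about zero-length matches in the quoted documentation can be illustrated with a short sketch. The patterns '_*' and '_+' are chosen here only for demonstration and do not come from the question.

```python
import re

text = 'ab_c'

# '_*' can match zero underscores, so it succeeds at position 0
# with an empty match rather than returning None.
empty = re.match(r'_*', text)

# '_+' needs at least one underscore at position 0, so it fails.
missing = re.match(r'_+', text)

print(empty.group() == '')  # True
print(missing)              # None
```

An empty match is still a match object, whereas None means the pattern could not be applied at all.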