I am trying to match any consecutive run of the SAME character that is not a period (.).
Let's say I have
line = '....xooo......'
If I do this,
for match in re.findall(r'[^\.]{2,}', line):
match is "xooo".
Instead, I only want "ooo", which is a run of the SAME character.
How do I do this?
re.search(r'(([^.])\2{1,})', line).group(1)
Explanation:
"(([^.])\2{1,})"
1st capturing group (([^.])\2{1,}): the whole run of identical characters.
2nd capturing group ([^.]): a negated character class that matches any single character except a literal period (.).
\2{1,}: matches the text saved in the 2nd capturing group again, one to infinitely many times [greedy]. It is equivalent to \2+.
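To make the group numbering concrete, here is a small sketch that runs both patterns on the line from the question. The variable names (loose, run, char) are only illustrative.

```python
import re

line = '....xooo......'

# The pattern from the question: any run of 2+ non-period characters,
# whether or not the characters are identical.
loose = re.findall(r'[^\.]{2,}', line)

# Backreference version: group 2 captures one non-period character,
# and \2+ requires that same character to repeat at least once more.
m = re.search(r'(([^.])\2+)', line)
run = m.group(1)   # the whole run of identical characters
char = m.group(2)  # the single character that repeats

print(loose, run, char)
```

Here loose holds the unwanted "xooo" match, while run holds only "ooo", because the backreference cannot match the "o" that follows "x".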
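If you want all the matches that satisfy that constraint (re.findall returns one tuple per match when the pattern has several groups, so take the first element of each):
>>> line = '....xooo...xx..yyyyy.'
>>> [t[0] for t in re.findall(r"(([^.])\2+)", line)]
['ooo', 'xx', 'yyyyy']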
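As an alternative sketch, re.finditer yields match objects, so the outer capturing group is not needed and there are no tuples to unpack. The helper name same_char_runs is hypothetical, not part of the re module.

```python
import re

# Only one group is needed here: \1 refers back to the captured character.
RUN = re.compile(r'([^.])\1+')

def same_char_runs(text):
    """Return every run of 2+ identical non-period characters in text."""
    # m.group(0) is the entire match, i.e. the full run.
    return [m.group(0) for m in RUN.finditer(text)]

runs = same_char_runs('....xooo...xx..yyyyy.')
print(runs)
```

This should give the same list as the findall version above, and it works the same way on Python 2 and Python 3.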