
Regex: How to match sequence of SAME characters?

Tags: python, regex

I am trying to match any run of consecutive identical characters, where the character is NOT a period (.).

Let's say I have

line = '....xooo......'

If I do this,

re.findall(r'[^.]{2,}', line)

it returns ['xooo'].

Instead, I only want "ooo", which is a run of the SAME character.

How do I do this?

user2492270 asked Nov 19 '25 20:11


1 Answer

re.search(r'(([^.])\2{1,})', line).group(1)

Explanation:

"(([^.])\2{1,})"
    1st capturing group (([^.])\2{1,}): the whole run
    2nd capturing group ([^.]): the first character of the run
      Negated character class [^.] matches any single character except a literal .
    \2{1,} matches the text saved in the 2nd capturing group again, 1 or more times (greedy)
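To see what each group holds, here is a quick check against the line from the question. It is only a sketch; the variable names whole_run and single_char are mine, not part of any API.

```python
import re

line = '....xooo......'

# Group 2 captures one non-period character; \2{1,} then requires
# that same character to repeat at least once more.
m = re.search(r'(([^.])\2{1,})', line)

whole_run = m.group(1)    # the full run of repeated characters
single_char = m.group(2)  # the single character that was repeated

print(whole_run, single_char)  # ooo o
```

The lone x is skipped because the character right after it is not another x, so the backreference fails there and the search moves on.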

If you want all the matches of that constraint:

>>> import re
>>> line = '....xooo...xx..yyyyy.'
>>> # findall returns one (group 1, group 2) tuple per match; keep group 1
>>> [t[0] for t in re.findall(r"(([^.])\2+)", line)]
['ooo', 'xx', 'yyyyy']
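If you would rather not carry the outer capturing group around, one alternative (my own variation, not required by the pattern above) is re.finditer, which gives you match objects so you can take the whole match with group(0). The helper name same_char_runs is just an illustrative choice.

```python
import re

# Only one capturing group is needed here: the single character.
# \1+ demands at least one more copy of it, so runs have length >= 2.
RUN = re.compile(r'([^.])\1+')

def same_char_runs(text):
    """Return every run of 2+ identical non-period characters in text."""
    return [m.group(0) for m in RUN.finditer(text)]

runs = same_char_runs('....xooo...xx..yyyyy.')
print(runs)  # ['ooo', 'xx', 'yyyyy']
```

This avoids the tuple unpacking that findall forces on you when the pattern has more than one group.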
dawg answered Nov 21 '25 10:11