
How can I remove multiple consecutive characters within a word with regular expressions?

Tags: python, regex

I want a regular expression (in Python) that given a sentence like:

heyy how are youuuuu, it's so cool here, cooool.

converts it to:

heyy how are youu, it's so cool here, cool.

In other words, a character may appear at most twice in a row; any further consecutive repeats should be removed.

heyy ==> heyy
youuuu ==> youu
cooool ==> cool
Ash asked Dec 22 '25 08:12


1 Answer

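You can use a backreference in the pattern to match repeated characters, then replace each run with two instances of the matched character. Here (.)\1+ matches any character followed by one or more copies of itself, i.e. a run of two or more identical characters, and the replacement \1\1 keeps exactly two of them. Note that . does not match a newline by default, so runs of blank lines are left alone: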

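import re
s = "heyy how are youuuuu, it's so cool here, cooool."
# (.) captures one character; \1+ matches one or more further copies of it
re.sub(r"(.)\1+", r"\1\1", s)
# "heyy how are youu, it's so cool here, cool."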
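If you need a different cap, or only want to squeeze letters so that numbers like 1000 and punctuation like ... are left untouched, the same backreference idea can be wrapped in a small helper. This is only a sketch: the name squeeze_repeats and the max_run and letters_only parameters are made up for illustration and are not part of the re module.

```python
import re

def squeeze_repeats(text, max_run=2, letters_only=False):
    """Collapse runs of one character longer than max_run down to max_run.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    if max_run < 1:
        raise ValueError("max_run must be at least 1")
    # [^\W\d_] matches a letter; "." matches any character except a newline.
    char = r"[^\W\d_]" if letters_only else r"."
    # Group 1 captures one character; \1{max_run,} then requires at least
    # max_run further copies, so only runs longer than max_run match.
    pattern = re.compile(r"(%s)\1{%d,}" % (char, max_run))
    return pattern.sub(lambda m: m.group(1) * max_run, text)

print(squeeze_repeats("heyy how are youuuuu, it's so cool here, cooool."))
# heyy how are youu, it's so cool here, cool.
print(squeeze_repeats("sooo 1000 ...", letters_only=True))
# soo 1000 ...
```

With the default max_run=2 this gives the same result as the one-liner above; it just skips runs that are already exactly two characters long instead of rewriting them to themselves.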
Psidom answered Dec 24 '25 00:12