
Python re.sub() with dots

Tags: python, regex

I would like to put a space before and after a dot in a text, but only if it's not part of a date.

So far I have this. I figured out that I have to do something with \D\. but that matches the letter before the dot as well, not only the dot:

string = re.sub(r"\.", " . ", string)

For example:

Input text:

1992.01.04 is my birthday.

Required output:

1992.01.04 is my birthday .

There is a space before the dot at the end of the string.

My other question is the same, but with a colon and a time:

Input text:

The time is 11:48, reported by: Tom.

Required output:

The time is 11:48, reported by : Tom.

There is a space after the text 'reported by', before the colon.

ptamas90 asked Oct 20 '25 14:10



2 Answers

You can use this regex, which uses a negative lookbehind and a negative lookahead to match a dot or colon only when it is not preceded by two digits and not followed by a digit, and replace each match with ' \1 ':

(?<!\d\d)([.:])(?!\d+)

Demo: https://regex101.com/r/hr6slz/4

This regex works for both the colon and the dot, and you can replace each match with ' \1 '.
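
As a minimal sketch of how this pattern might be applied in Python, the replacement ' \1 ' can be passed to re.sub as a raw string. The helper name space_punctuation is made up for illustration and is not part of the answer above.

```python
import re

# Pattern from the answer: a dot or colon that is not preceded by
# two digits and not followed by a digit.
PATTERN = re.compile(r"(?<!\d\d)([.:])(?!\d+)")


def space_punctuation(text):
    """Hypothetical helper: put a space on each side of matching dots/colons."""
    return PATTERN.sub(r" \1 ", text)


print(repr(space_punctuation("1992.01.04 is my birthday.")))
# '1992.01.04 is my birthday . '
print(repr(space_punctuation("The time is 11:48, reported by: Tom.")))
# 'The time is 11:48, reported by :  Tom . '
```

Note that the result carries a trailing space, and a doubled space where the original text already had one after the colon, so some whitespace cleanup may be wanted afterwards. Also, a dot that directly follows two digits at the end of a sentence (for example "born in 1992.") is left untouched by this pattern.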

Pushpesh Kumar Rajwanshi answered Oct 22 '25 04:10



You need a positive lookbehind assertion (or a negative one, with \d). See https://docs.python.org/3/library/re.html for details.

re.sub(r"(?<=\D)\.(\D?)", r" . \1", '1992.01.04 is my birthday.')
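
To illustrate why a lookbehind addresses the problem described in the question, here is a small comparison sketch (the variable names are illustrative only). A plain \D\. includes the character before the dot in the match, so that character gets replaced too, whereas (?<=\D) only checks that character without consuming it.

```python
import re

text = "1992.01.04 is my birthday."

# \D is part of the match here, so the "y" before the dot is replaced as well.
consuming = re.sub(r"\D\.", " . ", text)

# The lookbehind only asserts that a non-digit precedes the dot;
# the match itself is just the dot.
lookbehind = re.sub(r"(?<=\D)\.", " . ", text)

print(repr(consuming))   # '1992.01.04 is my birthda . '
print(repr(lookbehind))  # '1992.01.04 is my birthday . '
```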
Slam answered Oct 22 '25 05:10



