I have the following line of code, which splits the string data2 into a list on whitespace:
string_list = data2.split()
However, some of my data contains dates in the format "28, Dec". The code above splits on the whitespace between the day and the month, which I don't want. Is there a way to say "split on whitespace, but not if it comes after a comma"?
You need to use regular expressions.
>>> import re
>>> re.split('(?<!,) ', 'blah blah, blah')
['blah', 'blah, blah']
From the Python re module documentation:
(?<!...) Matches if the current position in the string is not preceded by a match for .... This is called a negative lookbehind assertion. Similar to positive lookbehind assertions, the contained pattern must only match strings of some fixed length. Patterns which start with negative lookbehind assertions may match at the beginning of the string being searched.
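One difference worth noting: data2.split() splits on runs of any whitespace, while the pattern above splits only on a single space character. The sketch below is one possible variation that stays closer to split(): it uses \s+ for runs of whitespace and widens the lookbehind to [,\s] so that two or more spaces after a comma are also left alone. The function name split_outside_commas and the sample string are made up for illustration; they are not from the question.

```python
import re

def split_outside_commas(text):
    """Split on whitespace runs, except whitespace that follows a comma."""
    # (?<![,\s]) : the run must not start right after a comma, nor partway
    #              through a run of whitespace that began after a comma
    # \s+        : one or more whitespace characters, roughly like str.split()
    # strip() avoids empty strings at the ends caused by leading/trailing spaces
    return re.split(r"(?<![,\s])\s+", text.strip())

data2 = "meeting 28, Dec at noon"  # hypothetical sample input
string_list = split_outside_commas(data2)
print(string_list)  # ['meeting', '28, Dec', 'at', 'noon']
```

Unlike str.split(), this returns [''] rather than [] for an empty string, so check for that case if your data can be empty.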