Let's say I have a string that looks like this:
one two three "four five"
I'd like to split such that I get an array:
['one', 'two', 'three', 'four five']
Using split with ' ' will not be enough here; the double-quoted part has to be handled separately. Is there a best-practice technique for this, or should I re-invent the wheel and do it myself?
Use the following regex matching:
import re
s = 'one two three "four five"'
words = re.findall(r'(\w+|"[^"]+")', s)
print(words)
(a|b) - matches either what is before the | or what is after it
\w+ - matches a run of one or more word characters
"[^"]+" - matches any sequence of characters other than " that is surrounded by quotes
Output (note that the quoted token keeps its quotes):
['one', 'two', 'three', '"four five"']
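The regex above keeps the surrounding quotes on the quoted token, while the question asks for 'four five' without them. A minimal sketch of one way to drop them, using capture groups, follows. The function name split_quoted and the pattern are illustrative assumptions, not part of the original answer, and the sketch does not handle escaped quotes inside a quoted run.

```python
import re

# Either a double-quoted run (captured without its quotes)
# or a bare run of non-whitespace characters.
_TOKEN = re.compile(r'"([^"]*)"|(\S+)')


def split_quoted(s):
    """Split on whitespace, treating "..." as a single token (hypothetical helper)."""
    words = []
    for m in _TOKEN.finditer(s):
        # Exactly one of the two groups participates in each match.
        quoted, bare = m.group(1), m.group(2)
        words.append(quoted if quoted is not None else bare)
    return words


words = split_quoted('one two three "four five"')
print(words)
```

Unlike \w+, the \S+ alternative also keeps punctuation inside bare words, which may or may not be what you want.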
As for "should I re-invent the wheel and do it myself?": there is no need. We can reuse a standard-library module meant for something else, the csv reader, with a space as the delimiter:
import csv
import io
s = 'one two three "four five"'
# Treat the string as a one-line CSV file whose delimiter is a space
f = csv.reader(io.StringIO(s), dialect="excel", delimiter=" ")
for i in f:
    print(i)
['one', 'two', 'three', 'four five']
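For comparison, the standard library also ships shlex, whose split function tokenizes a string using shell-like quoting rules. This is a different technique from the csv approach above, shown here only as a sketch; the variable names are illustrative. One practical difference: shlex collapses runs of whitespace, whereas csv.reader treats every single space as a field separator.

```python
import shlex

s = 'one two three "four five"'

# shlex.split honours double (and single) quotes and strips them from the result.
words = shlex.split(s)
print(words)

# Repeated spaces between tokens do not produce empty entries.
spaced = shlex.split('one   two "four five"')
print(spaced)
```

Because shlex follows shell conventions, backslashes and single quotes in the input are also interpreted, so it fits best when the input really is command-line-like.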