
Python: separating words using space, but preserving double quotes surrounded text as single unit

Tags:

python

Let's say I have a string that looks like this:

one two three "four five"

I'd like to split such that I get an array:

['one', 'two', 'three', 'four five']

Using split(' ') alone is not enough here, because it would also break the quoted text apart at its inner space. I would have to separate out the double-quoted parts first. Is there a best-practice technique for this, or should I re-invent the wheel and do it myself?

JasonGenX asked Dec 13 '25 13:12



2 Answers

Use re.findall with the following regex:

import re

s = 'one two three "four five"'
words = re.findall(r'(\w+|"[^"]+")', s)
print(words)
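  • (a|b) - matches either the alternative before the | or the one after it
  • \w+ - matches a run of one or more word characters (letters, digits, underscore)
  • "[^"]+" - matches a double quote, one or more characters other than ", and a closing double quote

Output (the quoted phrase keeps its surrounding quotes):

['one', 'two', 'three', '"four five"']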
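The result above still carries the double quotes around "four five", while the question asks for 'four five'. One possible way to drop them, sketched here as a variation and not as the answerer's own code, is to capture the text inside the quotes in its own group. The function name split_keep_quoted is made up for illustration, and \S+ is swapped in for \w+ on the assumption that unquoted words may contain punctuation.

```python
import re

# Hypothetical helper: either a double-quoted run (captured without its quotes)
# or a run of non-whitespace characters.
TOKEN = re.compile(r'"([^"]*)"|(\S+)')

def split_keep_quoted(text):
    tokens = []
    for match in TOKEN.finditer(text):
        quoted, bare = match.group(1), match.group(2)
        # group(1) is None when the unquoted alternative matched
        tokens.append(quoted if quoted is not None else bare)
    return tokens

print(split_keep_quoted('one two three "four five"'))
# ['one', 'two', 'three', 'four five']
```

An unclosed quote is not treated specially in this sketch: the stray " simply stays attached to the word that follows it.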
RomanPerekhrest answered Dec 16 '25 05:12



Taking "should I re-invent the wheel and do it myself?" as the main concern, we can avoid writing a parser and reuse a standard-library module built for a different job: csv, with a space as the delimiter.

import csv
import io

s = 'one two three "four five"'
# Treat the string as a one-line CSV file whose delimiter is a space
reader = csv.reader(io.StringIO(s), dialect="excel", delimiter=" ")
for row in reader:
    print(row)

['one', 'two', 'three', 'four five']
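If the list itself is wanted and not a printed row, the same idea can be wrapped in a small function. This is only a sketch: the name split_quoted is hypothetical, and it assumes the input is a single line whose words are separated by single spaces.

```python
import csv

def split_quoted(text):
    # csv.reader accepts any iterable of lines, so a one-element list is enough
    reader = csv.reader([text], delimiter=" ")
    # One input line yields one row: the list of fields
    return next(reader)

print(split_quoted('one two three "four five"'))
# ['one', 'two', 'three', 'four five']
```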
geekay answered Dec 16 '25 04:12



