
Split a string into a list of tuples based selectively on specific commas within the string

I have a long Python string of the form:

string='Black<5,4>, Black<9,4>'

How can I split this string, and any other string of arbitrary length with the same form (i.e. ArbitraryString1<ArbitraryListOfIntegers1>, ArbitraryString2<ArbitraryListOfIntegers2>, ...), into a list of tuples?

For example, the following would be the desired output from string:

list_of_tuples=[('Black',[5,4]),('Black',[9,4])]

Usually I'd use string.split on the commas to produce a list and then a regex to separate the word from the <>, but since commas also delimit my indices (the contents of the <>), this doesn't work.
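
To illustrate the problem, here is a minimal sketch of what a plain split on commas does to the sample string (the variable names s and parts are only for illustration). The commas inside the <> are treated like any other comma, so each bracketed list is cut in half:

```python
s = 'Black<5,4>, Black<9,4>'

# Splitting on every comma also splits inside the <...> parts
parts = s.split(',')
print(parts)
# => ['Black<5', '4>', ' Black<9', '4>']
```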

CiaranWelsh asked Feb 03 '26 15:02


1 Answer

You may use a regex to capture 1+ word chars before a < and capture everything inside <...> into another group, and then split Group 2 contents with , casting the values to int:

import re
s='Black<5,4>, Black<9,4>'
# list(...) is needed in Python 3, where map returns a lazy iterator
print([(x, list(map(int, y.split(',')))) for x,y in re.findall(r'(\w+)<([^<>]+)>', s)])
# => [('Black', [5, 4]), ('Black', [9, 4])]

See the Python demo

Pattern details:

  • (\w+) - Group 1 (assigned to x): 1 or more word chars
  • < - a literal <
  • ([^<>]+) - Group 2 (assigned to y): 1+ chars other than < and >
  • > - a literal >.
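
If you need this in more than one place, the same pattern can be wrapped in a small function. This is only a sketch: the names PAIR_RE and parse_pairs are hypothetical, and it assumes everything inside <...> is a comma-separated list of integers (anything else makes int raise a ValueError).

```python
import re

# Same pattern as above, compiled once. PAIR_RE and parse_pairs are
# hypothetical names chosen for this sketch.
PAIR_RE = re.compile(r'(\w+)<([^<>]+)>')

def parse_pairs(text):
    """Return a list of (name, [int, ...]) tuples found in text."""
    result = []
    for name, numbers in PAIR_RE.findall(text):
        # int() ignores surrounding whitespace, so '1, 2' parses too
        result.append((name, [int(n) for n in numbers.split(',')]))
    return result

print(parse_pairs('Black<5,4>, Black<9,4>'))
# => [('Black', [5, 4]), ('Black', [9, 4])]
print(parse_pairs('Red<1, 2, 3>, Blue<7>'))
# => [('Red', [1, 2, 3]), ('Blue', [7])]
```

Since the regex only matches name<...> pairs, the commas and spaces between the pairs are skipped without any extra handling.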
Wiktor Stribiżew answered Feb 05 '26 08:02