
Python - regex - Splitting string before word

I am trying to split a string in Python before a specific word. For example, I would like to split the following string before each occurrence of "path:".

  • split string before "path:"
  • input: "path:bte00250 Alanine, aspartate and glutamate metabolism path:bte00330 Arginine and proline metabolism"
  • output: ['path:bte00250 Alanine, aspartate and glutamate metabolism', 'path:bte00330 Arginine and proline metabolism']

I have tried

import re

rx = re.compile("(:?[^:]+)")
rx.findall(line)

This does not split the string where I want. The trouble is that the values after "path:" are not known in advance, so I cannot specify the whole word in the pattern. Does anyone know how to do this?

Dyna asked Oct 23 '25 22:10



1 Answer

Using a regular expression to split your string seems like overkill: the string split() method may be all you need. Note that split() drops the separator, so "path:" has to be added back to each piece.
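As a rough illustration of the split()-only route, here is a minimal sketch. The helper name split_before is hypothetical (it is not from the question), and it assumes the marker word never appears inside the values themselves.

```python
def split_before(text, word):
    """Split text before each occurrence of word, using only str.split().

    Hypothetical helper for illustration; assumes word does not occur
    inside the values that follow it.
    """
    # str.split() removes the separator, so it is prepended again here.
    # Empty chunks (e.g. the one before a leading 'path:') are skipped.
    return [word + chunk.strip() for chunk in text.split(word) if chunk.strip()]


line = ('path:bte00250 Alanine, aspartate and glutamate metabolism '
        'path:bte00330 Arginine and proline metabolism')
parts = split_before(line, 'path:')
print(parts)
```

One side effect of this approach: any text that comes before the first "path:" also gets the prefix added, so it only fits input that starts with the marker.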

Anyway, if you really need a regular expression to split your string, use re.split(), which splits a string at every match of the expression.

Also, use a regular expression that matches only the points where you want to split:

>>> import re
>>> line = 'path:bte00250 Alanine, aspartate and glutamate metabolism path:bte00330 Arginine and proline metabolism'
>>> re.split(' (?=path:)', line)
['path:bte00250 Alanine, aspartate and glutamate metabolism', 'path:bte00330 Arginine and proline metabolism']

The (?=...) group is a lookahead assertion: the expression matches a space (note the space at the start of the expression) that is followed by the string 'path:', without consuming what follows the space. Because only the space is consumed, 'path:' stays at the start of the next piece.
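To make the effect of the lookahead concrete, the sketch below compares a pattern that consumes 'path:' with one that only looks ahead for it. The variable names are illustrative, and the use of \s+ instead of a single space is an assumption to tolerate tabs or repeated spaces.

```python
import re

line = ('path:bte00250 Alanine, aspartate and glutamate metabolism '
        'path:bte00330 Arginine and proline metabolism')

# Consuming pattern: ' path:' is part of the match, so it is removed
# from the result and the second piece loses its prefix.
consumed = re.split(r' path:', line)

# Lookahead pattern: only the whitespace is matched and removed;
# 'path:' is checked but not consumed, so it stays with the next piece.
kept = re.split(r'\s+(?=path:)', line)

print(consumed)
print(kept)
```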

Adrien Plisson answered Oct 26 '25 13:10
