How to avoid capturing groups in RegEx splitting result?

Question

I'm trying to use re to match a pattern that starts with '\n', followed by an optional 'real(r8)', then zero or more whitespace characters, then the word 'function'. I then want to split the string wherever a match occurs. So for this string,

text = '''functional \n   function disdat \nkitkat function wakawak\nreal(r8) function noooooo \ndoit'''

I would like:

['functional ',
 ' disdat \nkitkat function wakawak',
 ' noooooo \ndoit']

However,

import re

regex = re.compile(r'''\n(real\(r8\))?\s*\bfunction\b''')

regex.split(text)

returns

['functional ',
 None,
 ' disdat \nkitkat function wakawak',
 'real(r8)',
 ' noooooo \ndoit']

split returns the matches' groups too. How do I ask it not to?

thefourtheye · Accepted Answer

You can use a non-capturing group, like this:

>>> regex = re.compile(r'\n(?:real\(r8\))?\s*\bfunction\b')
>>> regex.split(text)
['functional ', ' disdat \nkitkat function wakawak', ' noooooo \ndoit']

Note the ?: in (?:real\(r8\)). Quoting the Python documentation for (?:...):

A non-capturing version of regular parentheses. Matches whatever regular expression is inside the parentheses, but the substring matched by the group cannot be retrieved after performing a match or referenced later in the pattern.
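
To see the difference side by side, here is a small self-contained sketch that runs both patterns against the string from the question. The variable names (capturing, non_capturing, with_groups, without_groups) are only illustrative and are not from the original post.

```python
import re

# The sample string from the question (a normal string, so \n is a real newline).
text = 'functional \n   function disdat \nkitkat function wakawak\nreal(r8) function noooooo \ndoit'

# Capturing group: split() also returns the group's text for every match,
# or None when the optional group did not take part in the match.
capturing = re.compile(r'\n(real\(r8\))?\s*\bfunction\b')
with_groups = capturing.split(text)

# Non-capturing group: the same positions match, but split() has no group
# to report, so only the pieces between matches are returned.
non_capturing = re.compile(r'\n(?:real\(r8\))?\s*\bfunction\b')
without_groups = non_capturing.split(text)

print(capturing.groups, with_groups)
print(non_capturing.groups, without_groups)
```

The groups attribute of a compiled pattern reports how many capturing groups it has, which is a quick way to check whether split will add extra items to its result.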
