
How to avoid capturing groups in RegEx splitting result?

I'm trying to use re to match a pattern that starts with '\n', followed by an optional 'real(r8)', then zero or more whitespace characters, and then the word 'function'. I want to split the string wherever a match occurs. So for this string,

text = '''functional \n   function disdat \nkitkat function wakawak\nreal(r8) function noooooo \ndoit'''

I would like:

['functional ',
 ' disdat \nkitkat function wakawak',
 ' noooooo \ndoit']

However,

import re

regex = re.compile(r'''\n(real\(r8\))?\s*\bfunction\b''')

regex.split(text)

returns

['functional ',
 None,
 ' disdat \nkitkat function wakawak',
 'real(r8)',
 ' noooooo \ndoit']

split returns the matches' groups too. How do I ask it not to?

qAp asked Oct 16 '25 14:10


1 Answer

You can use a non-capturing group, like this:

>>> regex = re.compile(r'\n(?:real\(r8\))?\s*\bfunction\b')
>>> regex.split(text)
['functional ', ' disdat \nkitkat function wakawak', ' noooooo \ndoit']

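Note the ?: in (?:real\(r8\)). re.split includes the text of every capturing group in the returned list (None when an optional group did not take part in the match), so making the group non-capturing removes those extra entries. Quoting the Python documentation for (?:...):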

A non-capturing version of regular parentheses. Matches whatever regular expression is inside the parentheses, but the substring matched by the group cannot be retrieved after performing a match or referenced later in the pattern.
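
To see the difference side by side, here is a minimal sketch that runs both patterns against the string from the question. The variable names (capturing, non_capturing, with_groups, without_groups) are only illustrative and are not part of the original code.

```python
import re

# Same string as in the question; the \n sequences are real newlines.
text = 'functional \n   function disdat \nkitkat function wakawak\nreal(r8) function noooooo \ndoit'

# Capturing group: split() inserts the group's text (or None) between the pieces.
capturing = re.compile(r'\n(real\(r8\))?\s*\bfunction\b')

# Non-capturing group: same matches, but nothing extra is inserted.
non_capturing = re.compile(r'\n(?:real\(r8\))?\s*\bfunction\b')

with_groups = capturing.split(text)
without_groups = non_capturing.split(text)

print(capturing.groups, with_groups)
print(non_capturing.groups, without_groups)
```

The groups attribute of a compiled pattern reports how many capturing groups it has, which is a quick way to check that a pattern will not add extra items to the split result.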

thefourtheye answered Oct 18 '25 02:10


