
non-capturing version of regular parentheses Python

Tags:

python

regex

My goal is to locate IP addresses inside a text.

Using grep, I was able to do it with the regular expression ([0-9]+\.){3}[0-9]+.

With re from Python, I don't understand why it doesn't work unless I precede the expression inside the parentheses with ?:

I understand that the use of ?: will prevent the creation of a group, but I can't explain the result when this prefix is deleted.

>>> import re
>>> s = '64 bytes from 10.11.1.5: icmp_seq=2 ttl=128 time=215 ms'
>>> p=re.compile(r"(?:[0-9]+\.){3}")
>>> p.findall(s)
['10.11.1.']
>>> p=re.compile(r"([0-9]+\.){3}")
>>> p.findall(s)
['1.']
jeff asked Oct 21 '25 15:10


1 Answer

See docs for re.findall:

Return all non-overlapping matches of pattern in string, as a list of strings. The string is scanned left-to-right, and matches are returned in the order found. If one or more groups are present in the pattern, return a list of groups; this will be a list of tuples if the pattern has more than one group. Empty matches are included in the result.

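The key sentence is the one about groups: if the pattern contains any capturing groups, findall returns those groups instead of the full matches. There are no capturing groups in your first pattern, since (?:...) is non-capturing, so it returns the one full match in the input provided, as a string: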

['10.11.1.']
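As an illustration of the rule quoted from the docs, here is a minimal sketch of the three return shapes of findall. It reuses the string from the question; the trailing [0-9]+ from the grep expression is added so the whole address is matched, and the variable names are only for illustration.

```python
import re

s = '64 bytes from 10.11.1.5: icmp_seq=2 ttl=128 time=215 ms'

# No capturing group: findall returns the full matches as strings.
no_group = re.findall(r"(?:[0-9]+\.){3}[0-9]+", s)

# One capturing group: findall returns what that group captured,
# not the full match.
one_group = re.findall(r"([0-9]+\.){3}[0-9]+", s)

# Two capturing groups: findall returns a list of tuples,
# one tuple per match, one element per group.
two_groups = re.findall(r"([0-9]+\.){3}([0-9]+)", s)

print(no_group)    # ['10.11.1.5']
print(one_group)   # ['1.']
print(two_groups)  # [('1.', '5')]
```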

But with ([0-9]+\.){3}, you do have a capturing group, so rather than returning the full match as a string, it returns a list of groups. Remember that

A repeated capturing group will only capture the last iteration. Put a capturing group around the repeated group to capture all iterations or use a non-capturing group instead if you're not interested in the data

which is why only the last repetition of the group appears in the result, as ['1.']. (The full match is not included; only the captured groups are.)
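If you want the full match but need to keep the parentheses capturing, a couple of options are sketched below. This is one possible approach rather than the only one: re.finditer yields match objects, and group(0) on a match object is always the entire match regardless of groups; alternatively, an outer capturing group can wrap the repeated one. The variable names are only for illustration.

```python
import re

s = '64 bytes from 10.11.1.5: icmp_seq=2 ttl=128 time=215 ms'

# Option 1: keep the capturing group, but ask each match object
# for group(0), which is the whole match.
pattern = re.compile(r"([0-9]+\.){3}[0-9]+")
full_matches = [m.group(0) for m in pattern.finditer(s)]

# Option 2: wrap everything in a single outer capturing group and
# make the repeated inner group non-capturing.
outer = re.findall(r"((?:[0-9]+\.){3}[0-9]+)", s)

# Nesting two capturing groups gives a tuple per match:
# the outer group holds all iterations, the inner only the last one.
nested = re.findall(r"(([0-9]+\.){3})", s)

print(full_matches)  # ['10.11.1.5']
print(outer)         # ['10.11.1.5']
print(nested)        # [('10.11.1.', '1.')]
```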

CertainPerformance answered Oct 23 '25 04:10


