I want to know how capturing (or non-capturing) groups affect lookarounds in regex. Here are two examples:
test (?:(?<!start).)+
test (?<!start).+
I would appreciate it if anybody could explain in detail how the regex engine interprets both cases.
\b vs. (\b). Edge cases involve back-referencing an optional group, but that isn't very interesting.
(?=...) and (?<=...) - can capture groups. For example, /(?=(\b\w+\b))/ will produce successful empty matches, where each match has a non-empty group. Another example: /(?<=(.))\1/ will match characters that follow identical characters.
(?!...) and (?<!...) - cannot capture groups. That makes a lot of sense when you think about it, because they never match: a negative lookaround succeeds only when its contents fail to match. They can still use capturing groups within them, though. For example, ^(?!.*(.).*\1).*$ will match a line that does not contain duplicated characters. Again, how \1 behaves outside the lookaround in that case is not particularly interesting.
Now, to your example. The two patterns match different texts:
(?<!start).+ - Check that the current position is not directly after the text "start", and then match all remaining characters (of the line). Examples:
"start1234end" matches the whole input: the start position isn't after the word "start".
"before123startAfter": suppose the previous match was "before123start" (with a different pattern that allows that). The next match cannot start here, so it skips one character and matches "fter".
(?:(?<!start).)+ - Here, the lookbehind assertion is repeated for every character (for intuition: if a group (?:...)+ is a loop, the assertion is inside the loop). A character will not be matched if it comes directly after the string "start":
"start1234end" - The first match will be "start". The engine cannot match the next character, '1', because it comes directly after "start", so the match stops. The next match will be "234end".
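If you want to check the behaviour described above yourself, here is a rough sketch using Python's re module. Python is my choice here, since the question does not name a regex flavour, and the script assumes both patterns are meant as the negative lookbehind (?<!start). The names ONCE, EACH and all_matches are made up for the illustration.

```python
import re

# Assumption: both patterns use the negative lookbehind (?<!start).
ONCE = re.compile(r"(?<!start).+")      # assertion checked once, where the match starts
EACH = re.compile(r"(?:(?<!start).)+")  # assertion checked again before every character


def all_matches(pattern, text):
    """Return the text of every non-overlapping match, left to right."""
    return [m.group(0) for m in pattern.finditer(text)]


print(all_matches(ONCE, "start1234end"))         # ['start1234end']
print(all_matches(EACH, "start1234end"))         # ['start', '234end']
print(all_matches(EACH, "before123startAfter"))  # ['before123start', 'fter']

# Resume the single-check pattern right after "before123start" (index 14).
# The lookbehind still sees the text before that index, so 'A' is skipped.
print(ONCE.search("before123startAfter", 14).group(0))  # fter

# Groups inside positive lookarounds do capture: each match is empty,
# but group 1 holds a whole word.
words = [(m.group(0), m.group(1))
         for m in re.finditer(r"(?=(\b\w+\b))", "hello world")]
print(words)  # [('', 'hello'), ('', 'world')]

# A group captured in a lookbehind can be back-referenced afterwards:
# this finds every character that directly follows an identical one.
repeats = [(m.start(), m.group(0))
           for m in re.finditer(r"(?<=(.))\1", "aabbbc")]
print(repeats)  # [(1, 'a'), (3, 'b'), (4, 'b')]

# A negative lookahead can still use a group internally.
no_dupes = re.compile(r"^(?!.*(.).*\1).*$")
print(bool(no_dupes.match("abc")), bool(no_dupes.match("abca")))  # True False
```

The first three prints show the difference between the two patterns: with the assertion outside the repetition it is only tested at the start of the match, while inside the group it is tested before each character, which is why "start1234end" is split in two. Other engines may differ in details, so treat the outputs as what Python does rather than a guarantee for every flavour.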