
Understanding \G and \K in regex

Tags:

regex

In a previous question, I asked how to match characters that follow a specific pattern. To be more specific, I would like to consider this example:

We want to match every x on a line that starts with -b or -d. We may want to replace these characters with o:

-a  x   x    xx x  x
-b x  x x   x   xx x
-c  x  x   x   x x x
-d x  x   x  xx x  x

The result would be this:

-a  x   x    xx x  x
-b o  o o   o   oo o
-c  x  x   x   x x x
-d o  o   o  oo o  o

anubhava answered my question with a pretty nice regex that has the same form as this one:

/([db]|\G)[^x-]*\Kx/g

Unfortunately, I did not completely understand how \G and \K work. I would like a more detailed explanation of this specific case.

I tried to use the Perl regex debugger, but its output is a bit cryptic.

Compiling REx "([db]|\G)[^x-]*\Kx"
Final program:
   1: OPEN1 (3)
   3:   BRANCH (15)
   4:     ANYOF[bd][] (17)
  15:   BRANCH (FAIL)
  16:     GPOS (17)
  17: CLOSE1 (19)
  19: STAR (31)
  20:   ANYOF[\x00-,.-wy-\xff][{unicode_all}] (0)
  31: KEEPS (32)
  32: EXACT <x> (34)
  34: END (0)
nowox asked Oct 15 '25 08:10



1 Answer

The correct regex is:

(-[db]|(?!^)\G)[^x-]*\Kx

Check this demo
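
The substitution can also be reproduced locally. The following is a minimal sketch in Ruby, chosen here only because its regex engine supports both \G and \K; the variable names (input, pattern, result) are illustrative and not part of the original answer. It runs the regex above over the sample from the question:

```ruby
# Sample input from the question; only the -b and -d lines should change.
input = <<~TEXT
  -a  x   x    xx x  x
  -b x  x x   x   xx x
  -c  x  x   x   x x x
  -d x  x   x  xx x  x
TEXT

# Same regex as in the answer, written as a Ruby literal.
pattern = /(-[db]|(?!^)\G)[^x-]*\Kx/

# gsub restarts each search where the previous match ended,
# which is the position \G anchors to.
result = input.gsub(pattern, "o")
puts result
```

The character class [^x-] excludes the hyphen, so a chain of matches cannot run past the - that opens the next line; that is what keeps the -c line untouched after the -b line has been processed.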

As per the regex101 description:

\G - asserts position at the end of the previous match, or at the start of the string for the first match. Because \G also succeeds at the very start of the input on the first attempt, an x that appears before any -b or -d would be matched as well; the negative lookahead (?!^) prevents \G from matching there.
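
To see \G in isolation, here is a small sketch in Ruby (an assumption of this illustration; the strings and variable names are made up for the example). The first call shows that a chain of \G matches stops as soon as the pattern cannot continue from the previous match; the other two compare the regex with and without the (?!^) guard on a string that contains no -b or -d at all:

```ruby
# \G matches only where the previous match ended, so the chain of
# replacements stops at the first character the pattern cannot cross.
chained = "xx-xx".gsub(/\Gx/, "o")
puts chained

# On the first attempt \G matches at the start of the string, so without a
# guard the x characters are replaced even though no -b or -d precedes them.
sample = "  x x"
unguarded = sample.gsub(/(-[db]|\G)[^x-]*\Kx/, "o")
guarded   = sample.gsub(/(-[db]|(?!^)\G)[^x-]*\Kx/, "o")
puts unguarded
puts guarded
```

This should print oo-xx, then "  o o" for the unguarded regex and the unchanged "  x x" for the guarded one (quotes added here only to make the leading spaces visible).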

\K - resets the starting point of the reported match. Any previously consumed characters are no longer included in the final match. Here \K drops everything matched before the x (the -b or -d prefix and the characters in between), so only the x itself is replaced and no back-reference is needed in the replacement.
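
As a rough illustration of what \K saves, the sketch below (again in Ruby, with illustrative variable names) compares the \K version against an equivalent regex without \K, which has to capture the skipped text and put it back through back-references:

```ruby
line = "-b x  x x"

# With \K, the reported match is only the x; everything consumed before
# \K ends up in the untouched text in front of the match.
with_k = /(-[db]|(?!^)\G)[^x-]*\Kx/
first = line.match(with_k)
puts first[0].inspect         # the text that gets replaced
puts first.pre_match.inspect  # the text kept in front of it

# Without \K, the skipped characters are part of the match, so they are
# captured in group 2 and restored with \1\2 in the replacement.
without_k = /(-[db]|(?!^)\G)([^x-]*)x/

replaced_with_k    = line.gsub(with_k, "o")
replaced_without_k = line.gsub(without_k, '\1\2o')
puts replaced_with_k
puts replaced_without_k
```

Both substitutions should give -b o  o o; the \K form simply keeps the replacement string down to the single character o.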

  • More details about \K
  • More details about \G
anubhava answered Oct 18 '25 21:10
