
Java - explain this regular expression (",(?=([^\"]*\"[^\"]*\")*[^\"]*$)", -1)

Tags:

java

regex

I am splitting the string "foo,bar,c;qual="baz,blurb",d;junk="quux,syzygy"" on commas, but I want to keep the commas that are inside quotes. This was answered in the question "Java: splitting a comma-separated string but ignoring commas in quotes", but that answer does not fully explain how the poster arrived at this piece of code:

line.split(",(?=([^\"]*\"[^\"]*\")*[^\"]*$)", -1);

OK so I do understand some of what is going on but there is a bit that is confusing me. I know the first comma is for matching.

Then

        (?= 

is a forward search (a lookahead).

Then the first part is grouped

  ([^\"]*\"[^\"]*\"). 

This is where I get confused. So the first part

  [^\"]* 

means, as far as I can tell, that the beginning of any line with quotes separates tokens zero or more times.

Then comes \". Is this like opening a quote in a string, or is it saying "match this quote"?

Then it repeats the exact same line of code, why?

      ([^\"]*\"[^\"]*\")

Does the second part add the same code again to say that it must finish with quotes?

Can someone explain the part I am not getting?

asked May 31 '26 00:05 by spaga


1 Answer

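[^\"] matches any single character except ", so [^\"]* matches any run of characters that contains no ". \" matches a literal " (the backslash only escapes the quote inside the Java string literal; the regex engine just sees "). So ([^\"]*\"[^\"]*\") matches a stretch of text that contains exactly two " and ends with the second one, that is, one complete quoted section plus whatever precedes it. Repeating that group with * and then requiring [^\"]*$ means the text from the comma to the end of the line must contain an even number of ", which is only true when the comma itself is outside quotes.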
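To make the "even number of quotes after the comma" idea concrete, here is a minimal Java sketch that runs the split from the question on the asker's sample string. The class name QuoteAwareSplit and the helper evenQuotesAfter are made up for this illustration and are not part of the original answer; the helper is only a plain-loop restatement of what the lookahead appears to check, under the assumption that quotes are balanced and never escaped.

```java
import java.util.Arrays;

class QuoteAwareSplit {
    // The comma is the delimiter. The lookahead succeeds only when the rest
    // of the line consists of zero or more complete quoted pairs followed by
    // quote-free text, i.e. an even number of double quotes remains.
    static final String COMMA_OUTSIDE_QUOTES = ",(?=([^\"]*\"[^\"]*\")*[^\"]*$)";

    // The -1 limit keeps trailing empty tokens instead of dropping them.
    static String[] split(String line) {
        return line.split(COMMA_OUTSIDE_QUOTES, -1);
    }

    // Hypothetical helper: the same test written as a loop. It counts the
    // double quotes after position index and reports whether that count is even.
    static boolean evenQuotesAfter(String line, int index) {
        int quotes = 0;
        for (int i = index + 1; i < line.length(); i++) {
            if (line.charAt(i) == '"') {
                quotes++;
            }
        }
        return quotes % 2 == 0;
    }

    public static void main(String[] args) {
        String line = "foo,bar,c;qual=\"baz,blurb\",d;junk=\"quux,syzygy\"";
        // Print each token on its own line.
        for (String token : split(line)) {
            System.out.println(token);
        }
        // The comma after "foo" is followed by four quotes; the comma inside
        // "baz,blurb" is followed by three.
        System.out.println(evenQuotesAfter(line, line.indexOf(',')));
        System.out.println(evenQuotesAfter(line, line.indexOf("baz") + 3));
    }
}
```

The commas inside "baz,blurb" and "quux,syzygy" are each followed by an odd number of quotes, so the lookahead fails there and those commas stay inside their tokens.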

answered Jun 01 '26 13:06 by M. Shaw