
foo.split(',').length != number of ',' found in 'foo'?

Tags: java, string, csv

Maybe it's because it's end of day on a Friday, and I have already found a work-around, but this is killing me.

I am using Java but am a .NET developer.

I have a string and I need to split it on commas. Let's say it's a row in a CSV file that has 200 columns. line.split(',').length will sometimes be 199, while the count of ',' will be 208 or 209. I even counted in two different ways to be sure (using a regex, then, after losing my sanity, manually looping through and checking each character).

What's the super-obvious-hit-face-on-desk thing I'm missing here? Why isn't foo.split(delim).length == CountOfOccurences(foo,delim) all the time, only sometimes?

thanks much

dferraro asked Dec 09 '25 22:12


1 Answer

First, there's an obvious difference of one. If there are 200 columns, all with text, there are 199 commas. Second, Java drops trailing empty strings by default. You can change this by passing a negative number as the second argument.

"foo,,bar,baz,,".split(",")

is:

{foo,,bar,baz}

an array of 4 elements. But

"foo,,bar,baz,,".split(",", -1)

is:

{foo,,bar,baz,,}

with all 6.

Note that only trailing empty strings are dropped by default.
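To see the mismatch from the question side by side, here is a minimal sketch. The class name SplitDemo and the helpers countChar and splitKeepingEmpties are made up for illustration; only String.split(String) and String.split(String, int) are real library calls.

```java
class SplitDemo {
    // Hypothetical helper: counts how many times c occurs in s.
    static int countChar(String s, char c) {
        int n = 0;
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) == c) {
                n++;
            }
        }
        return n;
    }

    // Hypothetical helper: splits on commas and keeps trailing empty fields,
    // because a negative limit disables the default trimming.
    static String[] splitKeepingEmpties(String line) {
        return line.split(",", -1);
    }

    public static void main(String[] args) {
        String line = "foo,,bar,baz,,";

        System.out.println(countChar(line, ','));             // 5 commas
        System.out.println(line.split(",").length);           // 4, trailing empties dropped
        System.out.println(splitKeepingEmpties(line).length); // 6, which is commas + 1
    }
}
```

With the negative limit, the field count is always the comma count plus one, which is the relationship the question expected.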

Finally, don't forget that the String is compiled into a regex. This is not applicable here, since , is not a special character, but you should keep it in mind.
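As an illustration of the regex point, here is a small sketch with a delimiter that is special in a regex. It assumes a made-up, dot-separated input; the class SplitRegexDemo and the helper splitLiteral are hypothetical names, while Pattern.quote is the standard way to make the delimiter match literally.

```java
import java.util.regex.Pattern;

class SplitRegexDemo {
    // Hypothetical helper: treats the delimiter as literal text, not as a regex,
    // and keeps trailing empty fields by passing a negative limit.
    static String[] splitLiteral(String line, String delimiter) {
        return line.split(Pattern.quote(delimiter), -1);
    }

    public static void main(String[] args) {
        String line = "a.b..c";

        // "." as a regex matches any character, so every piece is empty
        // and all of them are dropped as trailing empty strings.
        System.out.println(line.split(".").length);           // 0

        // Quoting the delimiter splits on the literal dot.
        System.out.println(splitLiteral(line, ".").length);   // 4
    }
}
```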

Matthew Flaschen answered Dec 12 '25 11:12


