
How to read data from a CSV that contains more separators than expected?

Tags:

java

csv

I use CsvJdbc to read data from a CSV. I get the CSV from a web service request, so it is not loaded from a file. I set these properties:

Properties props = new java.util.Properties();
props.put("separator", ";"); // separator is a semicolon
props.put("fileExtension", ".txt"); // file extension is .txt
props.put("charset", "UTF-8"); // UTF-8

My sample1.txt contains this data:

code;description
c01;d01
c02;d02

My sample2.txt contains this data:

code;description
c01;d01
c02;d0;;;;;2

Removing the header row from the CSV is an option for me, but changing the semicolon separator is not.

EDIT: My query for the ResultSet is: SELECT * FROM myCSV

I want to read the code column in both sample1.txt and sample2.txt with:

resultSet.getString(1)

and read the full description column, including its extra semicolons (d0;;;;;2). Is this possible with the CsvJdbc driver, or do I need to switch to a different driver?

Thank you for any advice!

herry asked Dec 03 '25 04:12


1 Answer

This problem occurs when messy, invalid input that you have to interpret is read by a high-level package that only handles clean input. A similar example is trying to read arbitrary HTML with an XML parser: close, but no cigar.

You can guess where I'm going: you need to pre-process your input.

The preprocessing may be very easy if you can make some assumptions about the data. For example, if the first column is guaranteed never to contain a quoted semicolon, the first semicolon on each line always ends the code column, and everything after it is the description.
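As a rough illustration of that kind of preprocessing, here is a minimal sketch. It assumes the first column never contains a semicolon, and that the downstream CSV parser honors double-quoted fields with doubled inner quotes (the usual CSV convention; check the CsvJdbc documentation for its quote settings before relying on this). The class and method names (`CsvPreprocessor`, `normalizeLine`, `normalize`) are made up for this example and are not part of any library.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CsvPreprocessor {

    // Hypothetical helper: keeps the first field as-is and wraps the rest of
    // the line in double quotes when it contains extra separators or quotes.
    // Assumption: the first column never contains the separator.
    static String normalizeLine(String line, char sep) {
        int first = line.indexOf(sep);
        if (first < 0) {
            return line; // no separator at all: leave the line untouched
        }
        String code = line.substring(0, first);
        String rest = line.substring(first + 1);
        if (rest.indexOf(sep) < 0 && rest.indexOf('"') < 0) {
            return line; // already a clean two-column line
        }
        // Double any embedded quotes, then quote the whole second field.
        String escaped = rest.replace("\"", "\"\"");
        return code + sep + "\"" + escaped + "\"";
    }

    // Applies normalizeLine to every line of the input.
    static List<String> normalize(List<String> lines, char sep) {
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            out.add(normalizeLine(line, sep));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> sample2 = Arrays.asList(
                "code;description",
                "c01;d01",
                "c02;d0;;;;;2");
        for (String line : normalize(sample2, ';')) {
            System.out.println(line);
        }
        // prints:
        // code;description
        // c01;d01
        // c02;"d0;;;;;2"
    }
}
```

You would run the text returned by the web service through something like this before handing it to the driver, so that each line ends up with exactly two fields.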

Ed Staub answered Dec 04 '25 18:12


