
Read spark csv with empty values without converting to null

Consider a CSV file:

Name,Color
Apple,""

Reading it with:

// mycsv is the path to the CSV file above
val df = spark.read
  .option("header", "true")
  .option("inferSchema", "true")
  .option("treatEmptyValuesAsNulls", "false")
  .csv(mycsv)

This still gives:

+--------+----------+
|Name    |Color     |
+--------+----------+
|   Apple|      null|
+--------+----------+

Expected was:

+--------+----------+
|Name    |Color     |
+--------+----------+
|   Apple|          |
+--------+----------+
supernatural asked Feb 02 '26 01:02



1 Answer

AFAIK, the option "treatEmptyValuesAsNulls" does not exist in Spark's CSV reader. See the doc for more details. Two other options may be of interest to you though: emptyValue and nullValue. By default, both are set to "". Because a null value is possible for any column type, while an empty value is only possible for string columns, the null check is applied before the empty-value check. Therefore, empty strings are interpreted as null values by default. If you set nullValue to anything other than "", such as "null" or "none", empty strings will be read as empty strings and no longer as null values.
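As a rough sketch of that last suggestion, something like the following should keep the quoted empty string in Color. The object name EmptyStringCsv and the helper readKeepingEmptyStrings are made up for illustration, the sentinel "null" is just one of the values mentioned above, and the local master setting is only an assumption for trying it out on one machine.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical wrapper object, named only for this example.
object EmptyStringCsv {

  // Hypothetical helper: reads a CSV with a header row and moves the null
  // marker away from "" so that empty strings are not folded into null.
  def readKeepingEmptyStrings(spark: SparkSession, path: String): DataFrame =
    spark.read
      .option("header", "true")
      .option("inferSchema", "true")
      // Any value other than "" works here; "null" is an arbitrary choice.
      .option("nullValue", "null")
      .csv(path)

  def main(args: Array[String]): Unit = {
    // Assumption: a local session is good enough for a quick check.
    val spark = SparkSession.builder()
      .appName("empty-string-csv")
      .master("local[*]")
      .getOrCreate()

    // args(0) is expected to be the path to the CSV file from the question.
    val df = readKeepingEmptyStrings(spark, args(0))
    df.show()

    spark.stop()
  }
}
```

Note that with this setting a cell containing the literal text null would be read as a null value, so pick a marker that cannot occur in your data.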

Oli answered Feb 04 '26 14:02
