Read spark csv with empty values without converting to null

Question

Consider a CSV file:

Name,Color
Apple,""

    val df = spark.read
      .option("header", "true")
      .option("inferSchema", "true")
      .option("treatEmptyValuesAsNulls", "false")
      .csv(mycsv)

This still gives:

+--------+----------+
|Name    |Color     |
+--------+----------+
|   Apple|      null|
+--------+----------+

Expected output:

+--------+----------+
|Name    |Color     |
+--------+----------+
|   Apple|          |
+--------+----------+

Oli · Accepted Answer

AFAIK, the option "treatEmptyValuesAsNulls" does not exist. See the doc for more details.

Two other options may be of interest to you, though: emptyValue and nullValue. By default, both are set to "". Because null is a valid value for any type, nullValue is tested before emptyValue, which only applies to string columns. As a result, empty strings are read as null by default.

If you set nullValue to anything other than "", such as "null" or "none", empty strings are read as empty strings and no longer as nulls.
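A minimal sketch of that suggestion, using the sample CSV from the question. The object name, the temporary file, the local master, and the choice of "null" as the sentinel are assumptions made for illustration; the sentinel should be a string that never appears as real data. The exact behaviour may also vary between Spark versions, so it is worth checking the result on your own setup.

```scala
import java.nio.file.Files

import org.apache.spark.sql.SparkSession

// Hypothetical standalone example; names and paths are illustrative only.
object EmptyStringCsv {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("empty-string-csv")
      .master("local[*]") // assumption: run locally for the demo
      .getOrCreate()

    // Write the two-line CSV from the question to a temporary file.
    val path = Files.createTempFile("fruits", ".csv")
    Files.write(path, "Name,Color\nApple,\"\"\n".getBytes("UTF-8"))

    val df = spark.read
      .option("header", "true")
      .option("inferSchema", "true")
      // Move nullValue away from "" so that a quoted empty string
      // is no longer matched as null.
      .option("nullValue", "null")
      .csv(path.toString)

    df.show()

    // Count the rows where Color is an empty string rather than null.
    val emptyColors = df.filter(df("Color") === "").count()
    println(s"Rows with an empty-string Color: $emptyColors")

    spark.stop()
  }
}
```

With this setting, a cell that literally contains the text null is the one read as a null value, so pick the sentinel with your data in mind.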

