Writing a CSV file using Spark and Scala: empty quotes instead of null values

I'm using Spark 2.4.1 with Scala and trying to write a DataFrame to a CSV file. When a value is null, the CSV contains "" in its place. Is it possible to remove those empty quotes?

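    import org.apache.spark.sql.Row
    import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

    // The third row has a null in the string column.
    val data = Seq(
      Row(1, "a"),
      Row(5, "z"),
      Row(5, null)
    )

    // Both columns are nullable.
    val schema = StructType(
      List(
        StructField("num", IntegerType, true),
        StructField("letter", StringType, true)
      )
    )

    val df = spark.createDataFrame(
      spark.sparkContext.parallelize(data),
      schema
    )

    df.write.csv("location/")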

The output looks like this:

1,a
5,z
5,""

But I want it to be:

1,a
5,z
5,

What should I do?

Thanks!

Ben Haim Shani asked Oct 17 '25 02:10


1 Answer

You can set the nullValue option on the writer; see the CSV-specific options. (The SaveMode is not related to the answer.)

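    import org.apache.spark.sql.SaveMode

    df.write
      .option("nullValue", null)    // the option suggested for null fields
      .mode(SaveMode.Overwrite)     // unrelated to the fix; just replaces old output
      .csv("location/")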
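
To check the effect locally, something like the following sketch may help. It rebuilds the DataFrame from the question, writes it with the nullValue option, and prints the raw lines of the written file so you can see whether the quotes are gone. The local SparkSession, the object name NullValueCsvCheck, and the scratch temp directory are assumptions made for illustration, not part of the question. Behaviour may differ across Spark versions, so treat this as a way to test rather than a guarantee.

```scala
import java.nio.file.Files

import org.apache.spark.sql.{Row, SaveMode, SparkSession}
import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

// Hypothetical object name, used only for this sketch.
object NullValueCsvCheck {
  def main(args: Array[String]): Unit = {
    // Assumption: a local session for experimentation only.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("null-value-csv-check")
      .getOrCreate()

    // Same schema and rows as in the question.
    val schema = StructType(List(
      StructField("num", IntegerType, nullable = true),
      StructField("letter", StringType, nullable = true)
    ))
    val rows = Seq(Row(1, "a"), Row(5, "z"), Row(5, null))
    val df = spark.createDataFrame(spark.sparkContext.parallelize(rows), schema)

    // Assumption: a scratch directory; any writable path would do.
    val out = Files.createTempDirectory("csv-null-check").resolve("out").toString

    df.coalesce(1)                      // keep the output in a single part file
      .write
      .option("nullValue", null)        // the option from the answer
      .mode(SaveMode.Overwrite)
      .csv(out)

    // Read the part file back as plain text to inspect the raw CSV lines.
    spark.read.textFile(out).collect().foreach(println)

    spark.stop()
  }
}
```

Reading the output back with spark.read.textFile instead of spark.read.csv is deliberate: the CSV reader would parse both an empty field and "" for you, which hides the difference you are trying to inspect.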
OldWolfs answered Oct 18 '25 20:10