
How to get integer value with leading zero in Spark (Scala)

I have a Spark DataFrame and am trying to add Year, Month and Day columns to it. The problem is that after adding these columns, the Month and Day values lose their leading zeros.

import org.apache.spark.sql.functions._
import spark.implicits._ // assumes a SparkSession named spark, as in spark-shell

val cityDF = Seq(("Delhi","India"),("Kolkata","India"),("Mumbai","India"),("Nairobi","Kenya"),("Colombo","Srilanka"),("Tibet","China")).toDF("City","Country")
val dateString = "2020-01-01"
val dateCol = to_date(lit(dateString))
val finaldf = cityDF.select($"*", year(dateCol).alias("Year"), month(dateCol).alias("Month"), dayofmonth(dateCol).alias("Day"))


I want to keep the leading zero in the Month and Day columns, but I get 1 instead of 01.
I am using the Year, Month and Day columns to create Spark partitions, so I need the leading zeros intact. So my question is: how do I keep the leading zeros in my DataFrame columns?

Stark asked Nov 28 '25 20:11



2 Answers

An Integer column can be converted to a String column, where leading zeros are possible, with the format_string function:

val finaldf =
  cityDF
    .select($"*",
      year(dateCol).alias("Year"),
      format_string("%02d", month(dateCol)).alias("Month"),
      format_string("%02d", dayofmonth(dateCol)).alias("Day")
    )
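
As a minimal sketch of what the "%02d" specifier does, the plain Scala snippet below (no Spark required) pads an integer to two digits. It assumes that format_string follows the same printf-style rules as Scala's String formatting. The names PadDemo and pad2 are hypothetical and used only for illustration.

```scala
// Plain Scala illustration of the "%02d" specifier; no Spark dependency.
object PadDemo {
  // pad2 is a hypothetical helper: width 2, padded on the left with zeros.
  def pad2(n: Int): String = "%02d".format(n)

  def main(args: Array[String]): Unit = {
    println(pad2(1))   // single-digit value gets a leading zero
    println(pad2(12))  // two-digit value is left unchanged
  }
}
```

The result is a String, so in the DataFrame the Month and Day columns become string columns rather than integer columns.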
pasha701 answered Dec 01 '25 11:12



Why not simply use date_format for that?

val finaldf = cityDF.select(
  $"*",
  year(dateCol).alias("Year"),
  date_format(dateCol, "MM").alias("Month"),
  date_format(dateCol, "dd").alias("Day")
)
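
To illustrate what the "MM" and "dd" patterns produce, here is a small plain Scala sketch using java.time instead of Spark. It assumes these two pattern letters mean the same thing in date_format as in java.time's DateTimeFormatter (two-digit, zero-padded month and day). The names DatePatternDemo, monthPart and dayPart are hypothetical.

```scala
import java.time.LocalDate
import java.time.format.DateTimeFormatter

// Plain Scala illustration of the "MM" and "dd" patterns; no Spark dependency.
object DatePatternDemo {
  private val monthFmt = DateTimeFormatter.ofPattern("MM") // two-digit month
  private val dayFmt = DateTimeFormatter.ofPattern("dd")   // two-digit day of month

  // Both helpers expect an ISO date string such as "2020-01-01".
  def monthPart(isoDate: String): String = LocalDate.parse(isoDate).format(monthFmt)
  def dayPart(isoDate: String): String = LocalDate.parse(isoDate).format(dayFmt)

  def main(args: Array[String]): Unit = {
    println(monthPart("2020-01-01")) // month with leading zero
    println(dayPart("2020-01-01"))   // day with leading zero
  }
}
```

As with format_string, the formatted values are strings, which is what preserves the leading zero.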
blackbishop answered Dec 01 '25 10:12
