Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Get weekday name from date in PySpark

I use this code to return the day name from a date of type string:

import Pandas as pd
df = pd.Timestamp("2019-04-10")
print(df.weekday_name)

so when I have "2019-04-10" the code returns "Wednesday"

I would like to apply it a column in PySpark DataFrame to get the day name in text. But it doesn't seem to work.

+-------------+
|Reported Date|
+-------------+
|    1/07/2010|
|    1/07/2010|
|    1/07/2010|
|    1/07/2010|
|    1/07/2010|
|    1/07/2010|
|    1/07/2010|    
+-------------+

I tried to do this:

sparkDF.withColumn("day", weekday_name(pd.Timestamp('Reported Date')))

But I get an Error massage:

NameError: name 'weekday_name' is not defined

like image 468
leena Avatar asked Dec 14 '25 18:12

leena


1 Answers

PySpark documentation is a bit unclear on this topic but it internally uses Java Date formats.

you can use like this:

df.show()
+----------+
|      date|
+----------+
|2010-01-07|
+----------+

df.printSchema()
root
 |-- date: date (nullable = true)

Now, To get the short name of weekday we can use E/EE/EEE and if you want the full name of then to need to give more than 3Es like EEEE

Short form:

import pyspark.sql.functions as f

df.withColumn('Day', f.date_format('date', 'E')).show()
+----------+---+
|      date|Day|
+----------+---+
|2010-01-07|Thu|
+----------+---+

Full:

df.withColumn('Day', f.date_format('date', 'EEEE')).show()
+----------+--------+
|      date|     Day|
+----------+--------+
|2010-01-07|Thursday|
+----------+--------+

like image 171
SMaZ Avatar answered Dec 19 '25 06:12

SMaZ



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!