Change month numbers to month name in a dataframe (PySpark)

I have a column of month numbers in a dataframe and want to convert it to month names. I tried the following, which raised a TypeError:

df['monthName'] = df['monthNumber'].apply(lambda x: calendar.month_name[x]) 

TypeError: 'Column' object is not callable

How can I get the month name?

I'm using Spark 2.1.1 and Python 2.7.6.

This is my code for the airline data analysis:

df_withDelay = df_mappedCarrierNames.filter(df_mappedCarrierNames.ArrDelay > 0)
sqlContext.registerDataFrameAsTable(df_withDelay,"SFO_ArrDelayAnalysisTable")
df_SFOArrDelay = sqlContext.sql \
                      ("select sfo.Month, sum(sfo.ArrDelay) as TotalArrivalDelay \
                      from SFO_ArrDelayAnalysisTable sfo \
                      where (sfo.Dest = 'SFO') \
                      group by sfo.Month")

I am trying to plot Month against ArrDelay. The query above returns Month as a number, so I tried the following:

udf = UserDefinedFunction(lambda x: calendar.month_abbr[int(x)], StringType())
new_df_mappedCarrierNames = df_mappedCarrierNames.select(*[udf(column).alias(name) if column == name else column for column in df_mappedCarrierNames.columns])

It works, but the months in my graph are not in sorted order, whereas with month numbers they are. My question is how to map month numbers to month names while keeping them sorted from Jan to Dec.

asked Jan 26 '26 10:01 by anaga


1 Answer

I would avoid UDFs where possible, as they don't scale well. To recover a sortable month number from a month name, try combining to_date(), date_format(), and a cast to integer:

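from pyspark.sql.functions import col, date_format, to_date

# 'MMMM' matches a full month name (use 'MMM' for abbreviations such as 'Jan'); 'MM' yields the month number
df = df.withColumn('monthNumber', date_format(to_date(col('monthName'), 'MMMM'), 'MM').cast('int'))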

Details of date formatting codes: http://tutorials.jenkov.com/java-internationalization/simpledateformat.html
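
For the plotting side, here is a minimal pure-Python sketch of the ordering idea: keep the numeric month as the sort key and turn it into a label only after sorting. The helper name label_months_in_order and the sample rows are hypothetical; they stand in for the (Month, TotalArrivalDelay) rows that the query in the question returns, and the sketch assumes the default (English) locale for the month abbreviations.

```python
import calendar


def label_months_in_order(rows):
    """Sort (month_number, value) pairs by month number, then build name labels.

    `rows` is any iterable of 2-item sequences; the month may be an int or a
    numeric string.
    """
    # Sort on the numeric month so the order is Jan..Dec, not alphabetical.
    ordered = sorted(((int(month), value) for month, value in rows),
                     key=lambda pair: pair[0])
    # Attach the abbreviated month name only after the order is fixed.
    labels = [calendar.month_abbr[month] for month, _ in ordered]
    values = [value for _, value in ordered]
    return labels, values


# Hypothetical sample data standing in for the aggregated query result.
sample_rows = [(12, 300.0), (1, 120.0), (3, 80.0)]
labels, values = label_months_in_order(sample_rows)
print(labels)
print(values)
```

Since the aggregated result has at most twelve rows, collecting it to the driver should be cheap; PySpark Row objects are tuples, so the output of df_SFOArrDelay.collect() could likely be passed to this helper directly, and the returned labels and values handed to the plotting library in that order.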

answered Jan 28 '26 23:01 by Krzysztof Przysowa


