Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Combine pandas pivot table multi-index headers - one timestamp, one string

I have a pandas dataframe like so:

    weatherdate publishdate weathernode value
384 2018-01-30  2018-01-30  CYWG    -11.67
385 2018-01-31  2018-01-30  CYWG    -15.56
432 2018-01-30  2018-01-30  CYYZ    -8.33
433 2018-01-31  2018-01-30  CYYZ    -5.00
512 2018-01-31  2018-01-31  CYWG    -17.22
560 2018-01-31  2018-01-31  CYYZ    -6.67

Both date columns are of datetime64 format. I convert df to a pivot table like so:

dffcst = pd.pivot_table(df, index='weatherdate', columns=['weathernode','publishdate'], values='value') 

And get an output like so:

weathernode   CYWG                        CYYZ
publishdate   2018-01-30    2018-01-31  2018-01-30  2018-01-31
weatherdate             
2018-01-30    -11.67          NaN         -8.33       NaN
2018-01-31    -15.56         -17.22       -5.00      -6.67    

I would like to combine the two rows of column names above, to get an output like the following which I built in excel (note the format below is the best case scenario - remove the dashes between year/month/day, no spaces between date and airport code):

weatherdate CYWG20180130 CYWG20180131 CYYZ20180130 CYYZ20180131
1/30/2018   -11.67        NaN          -8.33        NaN
1/31/2018   -15.56        -17.22        -5          -6.67

I've tried to join them like so:

dffcst.columns = [''.join(col) for col in dffcst.columns]

But I get the following error:

TypeError: sequence item 1: expected str instance, Timestamp found

So pandas is having trouble joining the date in the first row of column names to the string in the second row of column names.

I've tried the following code:

dffcst.columns = [''.join(str(col)) for col in dffcst.columns]

It works, but the new column headers are two unwieldy for me to use, so I need an alternative solution, like what I built in Excel above. This is an example of the unwieldy column header:

('CYWG', Timestamp('2018-01-30 00:00:00'))

The goal is to export this table to a csv which would be pulled into an excel model. I could look up the excel model with an offset match so I can lookup various weather stations, on various different forecast dates. Happy to hear anyones thoughts on that idea as well.

like image 985
thesimplevoodoo Avatar asked Sep 08 '25 02:09

thesimplevoodoo


1 Answers

Convert the Timestamp objects to strings with strftime:

dffcst.columns = [c[0] + pd.datetime.strftime(c[1], '%y%m%d') for c in dffcst.columns]
like image 149
Peter Leimbigler Avatar answered Sep 09 '25 17:09

Peter Leimbigler