Pandas Melt function for time series data

Question

I am trying to melt my pandas data frame, but I am not quite sure how to assign the variables properly. I looked through the other examples on Stack Overflow, but I can't find a variation that matches this. My data frame (df1) looks like this:

[IN]: df1
[OUT]:
             40025.0    21201.0       30061.0   46021.0
date                
2020-08-08  0.000861    0.001292    0.000287    0.001177
2020-08-09  0.001147    0.001290    0.000344    0.001204
2020-08-10  0.001431    0.001288    0.000401    0.001231

Each column is a different FIPS code, the values are the number of Covid cases per day (this data has been processed for future clustering), and the index is a datetime index (daily). The data frame is 804 columns by 470 rows. I would like my data frame to look like this:

[image of the desired output not available]

I know I can make this work if I leave "date" as a column (as opposed to the index) by doing this:

df1 = df1.melt(id_vars="date", var_name="FIPS", value_name="Covid_cases")

But if I do that, I get an error when trying to set the "date" column as the index. I need the index to be a datetime index because I am going to k-means cluster the time series data and then plot the time series clusters. Any input would be greatly appreciated! Thank you!

tdy · Accepted Answer

If date is currently the index, you should be able to reset_index() and then set_index('date') afterwards:

df1 = (df1
    .reset_index()
    .melt(id_vars='date', var_name='FIPS', value_name='Covid_cases')
    .set_index('date')
)

               FIPS  Covid_cases
date                            
2020-08-08  40025.0     0.000861
2020-08-09  40025.0     0.001147
2020-08-10  40025.0     0.001431
2020-08-08  21201.0     0.001292
2020-08-09  21201.0     0.001290
2020-08-10  21201.0     0.001288
2020-08-08  30061.0     0.000287
2020-08-09  30061.0     0.000344
2020-08-10  30061.0     0.000401
2020-08-08  46021.0     0.001177
2020-08-09  46021.0     0.001204
2020-08-10  46021.0     0.001231
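
For anyone who wants to check this end to end, here is a minimal sketch that rebuilds the sample frame from the question and confirms the index is still a DatetimeIndex after melting. The names long_a and long_b are only illustrative. The second variant relies on the ignore_index parameter of melt, which I believe was added in pandas 1.1, so treat it as an assumption if you are on an older version.

```python
import pandas as pd

# Sample data copied from the question; the index is a DatetimeIndex named "date".
df1 = pd.DataFrame(
    {
        40025.0: [0.000861, 0.001147, 0.001431],
        21201.0: [0.001292, 0.001290, 0.001288],
        30061.0: [0.000287, 0.000344, 0.000401],
        46021.0: [0.001177, 0.001204, 0.001231],
    },
    index=pd.DatetimeIndex(["2020-08-08", "2020-08-09", "2020-08-10"], name="date"),
)

# Approach from the answer: move the index to a column, melt, then restore it.
long_a = (
    df1.reset_index()
    .melt(id_vars="date", var_name="FIPS", value_name="Covid_cases")
    .set_index("date")
)

# Alternative (assumes pandas >= 1.1): keep the original index while melting.
long_b = df1.melt(var_name="FIPS", value_name="Covid_cases", ignore_index=False)

# Both results should report a DatetimeIndex.
print(type(long_a.index).__name__, type(long_b.index).__name__)
```

The two results can differ in the dtype of the FIPS column, so compare values rather than relying on DataFrame.equals.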

Nk03 · Answer

Or you can do this via stack():

df1 = (
    df1.stack()
    .reset_index()
    # 'level_1' and 0 are the default names when the column axis is unnamed
    .rename(columns={'level_1': 'FIPS', 0: 'Covid_cases'})
    .set_index('date')
)
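
If you would rather not depend on the default 'level_1' and 0 labels, one possible variant is to name the column axis and the stacked Series explicitly. This is only a sketch on a two-column subset of the question's data, and long_df is an illustrative name. Note that stack() orders rows by date first and then by FIPS, so the row order differs from the melt output.

```python
import pandas as pd

# Two-column subset of the sample data from the question.
df1 = pd.DataFrame(
    {
        40025.0: [0.000861, 0.001147, 0.001431],
        21201.0: [0.001292, 0.001290, 0.001288],
    },
    index=pd.DatetimeIndex(["2020-08-08", "2020-08-09", "2020-08-10"], name="date"),
)

long_df = (
    df1.rename_axis(columns="FIPS")  # name the column axis so the stacked level is called "FIPS"
    .stack()                         # Series indexed by (date, FIPS)
    .rename("Covid_cases")           # name the values
    .reset_index(level="FIPS")       # move FIPS to a column, keep "date" as the index
)

print(long_df)
```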

Tags: python, pandas, dataframe

Rachel Cyr
