I am trying to melt my pandas data frame, but I am not quite sure how to assign the variables properly. I looked through the other examples on Stack Overflow, but I can't seem to find a variation matching this. My data frame (df1) looks like this:
[IN]: df1
[OUT]:
             40025.0   21201.0   30061.0   46021.0
date
2020-08-08  0.000861  0.001292  0.000287  0.001177
2020-08-09  0.001147  0.001290  0.000344  0.001204
2020-08-10  0.001431  0.001288  0.000401  0.001231
Each column is a different FIPS code, the values are the number of Covid cases per day (this data has been processed for future clustering), and the index is a datetime index (one row per day). The data frame is 804 columns by 470 rows. I would like to reshape it to long format, with the date as the index and one column each for the FIPS code and the Covid cases.
I know I can make this work if I leave "date" as a column (as opposed to the index) by doing this:
df1 = df1.melt(id_vars="date", var_name="FIPS", value_name="Covid_cases")
But if I do that, I get an error when trying to set the "date" column as the index. I need the index to be a datetime index because I am going to k-means cluster the time series data and then plot the time series clusters. Any input would be greatly appreciated! Thank you!
If date is currently the index, you should be able to call reset_index() before melting and then set_index('date') afterwards:
df1 = (df1
    .reset_index()
    .melt(id_vars='date', var_name='FIPS', value_name='Covid_cases')
    .set_index('date')
)
               FIPS  Covid_cases
date
2020-08-08  40025.0     0.000861
2020-08-09  40025.0     0.001147
2020-08-10  40025.0     0.001431
2020-08-08  21201.0     0.001292
2020-08-09  21201.0     0.001290
2020-08-10  21201.0     0.001288
2020-08-08  30061.0     0.000287
2020-08-09  30061.0     0.000344
2020-08-10  30061.0     0.000401
2020-08-08  46021.0     0.001177
2020-08-09  46021.0     0.001204
2020-08-10  46021.0     0.001231
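To check this end to end, here is a minimal, self-contained sketch. It rebuilds only the three sample rows shown in the question (the real frame is much larger), and the name long_df is an illustrative choice, not something from the original post. It confirms that the index is still a DatetimeIndex after the round trip through reset_index() and set_index():

```python
import pandas as pd

# Hypothetical sample: only the three days shown in the question.
df1 = pd.DataFrame(
    {
        40025.0: [0.000861, 0.001147, 0.001431],
        21201.0: [0.001292, 0.001290, 0.001288],
        30061.0: [0.000287, 0.000344, 0.000401],
        46021.0: [0.001177, 0.001204, 0.001231],
    },
    index=pd.DatetimeIndex(
        ["2020-08-08", "2020-08-09", "2020-08-10"], name="date"
    ),
)

long_df = (
    df1
    .reset_index()  # move the "date" index into a regular column
    .melt(id_vars="date", var_name="FIPS", value_name="Covid_cases")
    .set_index("date")  # restore "date" as the index
)

# The datetime dtype survives, so the index is a DatetimeIndex again.
print(type(long_df.index).__name__)
print(long_df.shape)
```

Because the "date" column keeps its datetime64 dtype through melt, set_index('date') should need no extra conversion step.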
Or you can do this via stack():
df1 = (
    df1.stack()
    .reset_index()
    .rename(columns={'level_1': 'FIPS', 0: 'Covid_cases'})
    .set_index('date')
)
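One difference worth knowing: stack() orders the rows by date first and then by FIPS code, while melt() lists all dates for one FIPS code before moving on to the next. The sketch below, on a hypothetical two-day, two-column sample, names the column axis up front so no 'level_1' rename is needed, and then compares the two approaches after sorting. The helper canon and the variable names are illustrative assumptions, not part of the original answers.

```python
import pandas as pd

# Hypothetical two-day, two-county sample in the same shape as the question.
df1 = pd.DataFrame(
    {40025.0: [0.000861, 0.001147], 21201.0: [0.001292, 0.001290]},
    index=pd.DatetimeIndex(["2020-08-08", "2020-08-09"], name="date"),
)

stacked = (
    df1
    .rename_axis(columns="FIPS")  # name the column axis before stacking
    .stack()                      # Series indexed by (date, FIPS)
    .rename("Covid_cases")        # name the values
    .reset_index(level="FIPS")    # keep "date" as the index
)

melted = (
    df1
    .reset_index()
    .melt(id_vars="date", var_name="FIPS", value_name="Covid_cases")
    .set_index("date")
)

def canon(frame):
    """Sort rows by date and FIPS so row order does not affect the comparison."""
    return (
        frame.reset_index()
        .sort_values(["date", "FIPS"])
        .reset_index(drop=True)
    )

# Raises AssertionError if the two results differ in content.
pd.testing.assert_frame_equal(canon(stacked), canon(melted), check_dtype=False)
```

Either form leaves "date" as a DatetimeIndex, so the choice mostly comes down to which row order is more convenient for the clustering step.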