Say I have a DataFrame as the following:
Name | Time worked in 1st hr | Time wasted in 1st hr | Time worked in 2nd hr | Time wasted in 2nd hr |
---|---|---|---|---|
foo | 45 | 15 | 40 | 20 |
bar | 35 | 25 | 55 | 5 |
baz | 50 | 10 | 45 | 15 |
I wish to use the melt on the 1st hour columns and 2nd hour columns to make it look like this:
Name | Hour number | Time worked in the hr | Time wasted in the hr |
---|---|---|---|
foo | 1 | 45 | 15 |
foo | 2 | 40 | 20 |
bar | 1 | 35 | 25 |
bar | 2 | 55 | 5 |
baz | 1 | 50 | 10 |
baz | 2 | 45 | 15 |
How would I group the "Time worked in 1st hr" and "Time wasted in 1st hr" together such that I can melt them both into the same row?
You can use:
df1 = df.set_index('Name')
df1.columns = df1.columns.str.split('in', expand=True)
df2 = (df1.stack()
.sort_index(axis=1, ascending=False)
.rename_axis(index=['Name', 'Hour number'])
.add_suffix('in the hr')
.reset_index()
)
df2['Hour number'] = df2['Hour number'].str.extract(r'(\d+)')
Result:
print(df2)
Name Hour number Time worked in the hr Time wasted in the hr
0 foo 1 45 15
1 foo 2 40 20
2 bar 1 35 25
3 bar 2 55 5
4 baz 1 50 10
5 baz 2 45 15
Something like:
import numpy as np
df = df.set_index('Name')
df.columns = pd.MultiIndex.from_arrays([np.repeat([1,2], len(df.columns)//2), np.tile(['worked', 'wasted'], len(df.columns)//2)])
df.stack(level=0)
NB. I couldn't test the code
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With