
How to convert groups of columns into rows in Pandas?

Say I have a DataFrame like the following:

Name  Time worked in 1st hr  Time wasted in 1st hr  Time worked in 2nd hr  Time wasted in 2nd hr
foo                      45                     15                     40                     20
bar                      35                     25                     55                      5
baz                      50                     10                     45                     15

I want to use melt on the 1st-hour and 2nd-hour columns so that the result looks like this:

Name  Hour number  Time worked in the hr  Time wasted in the hr
foo             1                     45                     15
foo             2                     40                     20
bar             1                     35                     25
bar             2                     55                      5
baz             1                     50                     10
baz             2                     45                     15

How would I group "Time worked in 1st hr" and "Time wasted in 1st hr" together so that both are melted into the same row?

crysoar asked Sep 05 '25 16:09


2 Answers

You can use:

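# Move 'Name' into the index so that only the time columns remain.
df1 = df.set_index('Name')
# Split each header on 'in' into a two-level MultiIndex,
# e.g. ('Time worked ', ' 1st hr').
df1.columns = df1.columns.str.split('in', expand=True)

df2 = (df1.stack()                                  # move the hour level into the rows
          .sort_index(axis=1, ascending=False)      # put 'worked' before 'wasted'
          .rename_axis(index=['Name', 'Hour number'])
          .add_suffix('in the hr')                  # 'Time worked ' -> 'Time worked in the hr'
          .reset_index()
      )

# Keep only the digits of the hour label, e.g. ' 1st hr' -> '1'.
df2['Hour number'] = df2['Hour number'].str.extract(r'(\d+)', expand=False)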

Result:

print(df2)

  Name Hour number  Time worked in the hr  Time wasted in the hr
0  foo           1                     45                     15
1  foo           2                     40                     20
2  bar           1                     35                     25
3  bar           2                     55                      5
4  baz           1                     50                     10
5  baz           2                     45                     15
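
Since the question asks about melt specifically, here is a rough sketch of a melt-based variant for comparison. It is not part of the answer above. It assumes the sample data from the question, headers of the exact form "Time <activity> in <N>th hr", and pandas 1.1 or later (for a list-valued `index` in `pivot`); the names `long`, `parts`, `header`, `kind` and `value` are made up for illustration.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Name": ["foo", "bar", "baz"],
    "Time worked in 1st hr": [45, 35, 50],
    "Time wasted in 1st hr": [15, 25, 10],
    "Time worked in 2nd hr": [40, 55, 45],
    "Time wasted in 2nd hr": [20, 5, 15],
})

# One row per (Name, original column header).
long = df.melt(id_vars="Name", var_name="header", value_name="value")

# Split each header into the activity ("worked"/"wasted") and the hour digits.
parts = long["header"].str.extract(r"Time (?P<kind>\w+) in (?P<hour>\d+)")
long["Hour number"] = parts["hour"].astype(int)
long["kind"] = "Time " + parts["kind"] + " in the hr"

# Spread the two activities back into columns, one row per (Name, hour).
result = (
    long.pivot(index=["Name", "Hour number"], columns="kind", values="value")
        .reset_index()
)
result.columns.name = None  # drop the leftover "kind" label
print(result)
```

The row and column order may differ from the table in the question, because `pivot` does not preserve the original order; reorder the columns explicitly if that matters.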
SeaBean answered Sep 07 '25 16:09


Something like:

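import numpy as np
import pandas as pd

df = df.set_index('Name')
# Relabel the columns as (hour, activity) pairs, then move the hour level into the rows.
df.columns = pd.MultiIndex.from_arrays([
    np.repeat([1, 2], len(df.columns)//2),
    np.tile(['worked', 'wasted'], len(df.columns)//2),
])
df.stack(level=0)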

NB. I couldn't test the code
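
Because the snippet above was posted untested, here is a self-contained sketch that runs the same idea on the sample data from the question. It assumes the columns come in worked/wasted pairs ordered by hour, and it numbers the hours 1..n instead of hard-coding `[1, 2]`, which is a small generalization not in the answer. The names `wide`, `n_hours` and `long` are illustrative only.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Name": ["foo", "bar", "baz"],
    "Time worked in 1st hr": [45, 35, 50],
    "Time wasted in 1st hr": [15, 25, 10],
    "Time worked in 2nd hr": [40, 55, 45],
    "Time wasted in 2nd hr": [20, 5, 15],
})

wide = df.set_index("Name")
n_hours = wide.shape[1] // 2  # assumes two columns (worked, wasted) per hour

# Relabel the columns positionally as (hour, activity) pairs:
# (1, worked), (1, wasted), (2, worked), (2, wasted), ...
wide.columns = pd.MultiIndex.from_arrays(
    [
        np.repeat(np.arange(1, n_hours + 1), 2),
        np.tile(["worked", "wasted"], n_hours),
    ],
    names=["Hour number", None],
)

# Move the hour level from the columns into the rows.
long = wide.stack(level=0).reset_index()
print(long)
```

The relabelling is purely positional, so it gives wrong labels if the columns are not in worked/wasted order for each hour. On pandas 2.1 and 2.2, `stack` may emit a `FutureWarning` about its new implementation; the values are unaffected for this data.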

mozway answered Sep 07 '25 17:09