How to pivot from columns to rows in Pandas

Question

data = [[1, 'tom', 10, 53, 2, 3, 9, 6 ], [2, 'nick', 1, 53, 2, 23, 4, 7], [3, 'juli', 9, 23, 2, 31, 9, 3]]
  
df = pd.DataFrame(data, columns = ['ID', 'Name', 'Apple.Food.0', 'Apple.Food.1', 'Apple.Food.2', 'Pear.Food.0', 'Pear.Food.1', 'Pear.Food.2'])
  
df

ID	Name	Apple.Food.0	Apple.Food.1	Apple.Food.2	Pear.Food.0	Pear.Food.1	Pear.Food.2
1	tom	10	53	2	3	9	6
2	nick	1	53	2	23	4	7
3	juli	9	23	2	31	9	3

I want to unpack last 6 columns to like: i.e. Apple.Food.0 - unpack prefix 'Apple' to the rows, unpack suffix '0' to the rows, keep 'Food' as the column name and keep the value there; then regenerate the ID column as the PK.

I tried to use .str.split() to split the column, however, the value will comes with all the spited columns. Any built-in pandas function could achieve this easily?

ID	Name	Fruit	Time	Food
1	tom	Apple	0	10
2	tom	Apple	1	53
3	tom	Apple	2	2
4	tom	Pear	0	3
5	tom	Pear	1	9
6	tom	Pear	2	6
7	nick	Apple	0	1
8	nick	Apple	1	53
9	nick	Apple	2	2
10	nick	Pear	0	23
11	nick	Pear	1	4
12	nick	Pear	2	7
13	juli	Apple	0	9
14	juli	Apple	1	23
15	juli	Apple	2	2
16	juli	Pear	0	31
17	juli	Pear	1	9
18	juli	Pear	2	3

jezrael · Accepted Answer

Use DataFrame.set_index for MultiIndex with columns without . first, then str.split with columns, reshape by DataFrame.stack, check index names by DataFrame.rename_axis and last add DataFrame.reset_index:

df1 = df.set_index(['ID','Name'])
df1.columns = df1.columns.str.split('.', expand=True)
df1 = df1.stack([0,2]).rename_axis(['ID','Name','Fruit','Time']).reset_index()
print (df1)
    ID  Name  Fruit Time  Food
0    1   tom  Apple    0    10
1    1   tom  Apple    1    53
2    1   tom  Apple    2     2
3    1   tom   Pear    0     3
4    1   tom   Pear    1     9
5    1   tom   Pear    2     6
6    2  nick  Apple    0     1
7    2  nick  Apple    1    53
8    2  nick  Apple    2     2
9    2  nick   Pear    0    23
10   2  nick   Pear    1     4
11   2  nick   Pear    2     7
12   3  juli  Apple    0     9
13   3  juli  Apple    1    23
14   3  juli  Apple    2     2
15   3  juli   Pear    0    31
16   3  juli   Pear    1     9
17   3  juli   Pear    2     3

Performance:

#3k rows
df = pd.concat([df] * 1000, ignore_index=True)

def f1():
    df1 = df.set_index(['ID','Name'])
    df1.columns = df1.columns.str.split('.', expand=True)
    df1 = df1.stack([0,2]).rename_axis(['ID','Name','Fruit','Time']).reset_index()
    
def f2():
    x = df.melt(['ID', 'Name'], value_name='Food')
    x[['Fruit', 'Time']] = x.variable.str.split('.', expand=True)[[0,2]]
    x = x.sort_values(['ID', 'Fruit']).reset_index(drop=True).drop('variable', 1)
    

In [41]: %timeit f1()
27.4 ms ± 1.6 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [42]: %timeit f2()
53.1 ms ± 1.6 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

How to pivot from columns to rows in Pandas

Tags:

python

pandas

Hello World

1 Answers

jezrael

Recent Activity

Donate For Us

How to pivot from columns to rows in Pandas

Tags:

python

pandas

Hello World

1 Answers

jezrael

Related questions

Recent Activity

Donate For Us