Trying to stack several groups of columns into multiple target columns by name

Question

I have the original dataframe like that which contains 1772 columns and 130 rows. I would like to stack them into multiple target columns.

id	AA_F1R1	BB_F1R1	AA_F1R2	BB_F1R2	...	AA_F2R1	BB_F2R2	...	AA_F7R25	BB_F7R25
001	5	xy	xx	xx	zy	1		4	xx
002	6	zzz	yyy	zzz	xw	2	zzz	3	zzz

I found two different solutions that seem to work but for me is giving an error. Not sure if they work with NaN values.

pd.wide_to_long(df, stubnames=['AA', 'BB'], i='id', j='dropme', sep='_')\
  .reset_index()\
  .drop('dropme', axis=1)\
  .sort_values('id')
Output:
0 rows × 1773 columns

Another solution I tried was

df.set_index('id', inplace=True)
df.columns = pd.MultiIndex.from_tuples(tuple(df.columns.str.split("_")))
df.stack(level = 1).reset_index(level = 1, drop = True).reset_index()

Output:
150677 rows × 2 columns

the problem with this last one is I couldn't keep the columns I wanted.

I appreciate any inputs!

jezrael · Accepted Answer

Use suffix=r'\w+' parameter in wide_to_long:

df = pd.wide_to_long(df, stubnames=['AA','BB'], i='id', j='dropme', sep='_', suffix=r'\w+')\
  .reset_index()\
  .drop('dropme', axis=1)\
  .sort_values('id')

In second solution add dropna=False to DataFrame.stack:

df.set_index('id', inplace=True)
df.columns = df.columns.str.split("_", expand=True)
df = df.stack(level = 1, dropna=False).reset_index(level = 1, drop = True).reset_index()

Trying to stack several groups of columns into multiple target columns by name

Tags:

python

pandas

dataframe

Helena

1 Answers

jezrael

Recent Activity

Donate For Us

Trying to stack several groups of columns into multiple target columns by name

Tags:

python

pandas

dataframe

Helena

1 Answers

jezrael

Related questions

Recent Activity

Donate For Us