
Trying to stack several groups of columns into multiple target columns by name

My original dataframe, shown below, has 1772 columns and 130 rows. I would like to stack the column groups into multiple target columns, one per name prefix (AA, BB).

id AA_F1R1 BB_F1R1 AA_F1R2 BB_F1R2 ... AA_F2R1 BB_F2R2 ... AA_F7R25 BB_F7R25
001 5 xy xx xx zy 1 4 xx
002 6 zzz yyy zzz xw 2 zzz 3 zzz

I found two different solutions that seem to work for others, but for me they fail. I am also not sure whether they work with NaN values.

pd.wide_to_long(df, stubnames=['AA', 'BB'], i='id', j='dropme', sep='_')\
  .reset_index()\
  .drop('dropme', axis=1)\
  .sort_values('id')
Output:
0 rows × 1773 columns

The other solution I tried was:

df.set_index('id', inplace=True)
df.columns = pd.MultiIndex.from_tuples(tuple(df.columns.str.split("_")))
df.stack(level = 1).reset_index(level = 1, drop = True).reset_index()

Output:
150677 rows × 2 columns 

The problem with this last one is that I couldn't keep the columns I wanted.

I appreciate any input!

Helena asked Jan 20 '26 22:01


1 Answer

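Use the suffix=r'\w+' parameter in wide_to_long. The default suffix, r'\d+', matches only numeric suffixes, so column names such as AA_F1R1 are not picked up and the result is empty: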

df = pd.wide_to_long(df, stubnames=['AA','BB'], i='id', j='dropme', sep='_', suffix=r'\w+')\
  .reset_index()\
  .drop('dropme', axis=1)\
  .sort_values('id')
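
To see the effect on a small scale, here is a minimal sketch on a made-up frame. The values and the two-suffix layout are assumptions for illustration only, not the asker's real data; one NaN is included to show that wide_to_long keeps it.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the question's frame: two stubs (AA, BB),
# non-numeric suffixes (F1R1, F1R2), and one missing value.
df = pd.DataFrame({
    "id": ["001", "002"],
    "AA_F1R1": [5, 6],
    "BB_F1R1": ["xy", "zzz"],
    "AA_F1R2": [np.nan, 2],
    "BB_F1R2": ["zy", "xw"],
})

# suffix=r'\w+' lets the alphanumeric suffixes F1R1, F1R2 match;
# the default r'\d+' would only match purely numeric suffixes.
long_df = (
    pd.wide_to_long(df, stubnames=["AA", "BB"], i="id", j="dropme",
                    sep="_", suffix=r"\w+")
    .reset_index()
    .drop("dropme", axis=1)   # the suffix column is not needed afterwards
    .sort_values("id")
)

print(long_df)
```

Each id ends up with one row per suffix, with the AA and BB values side by side, and the row holding the NaN is still present.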

In the second solution, add dropna=False to DataFrame.stack so that rows whose values are all NaN are not dropped:

df.set_index('id', inplace=True)
df.columns = df.columns.str.split("_", expand=True)
df = df.stack(level = 1, dropna=False).reset_index(level = 1, drop = True).reset_index()

jezrael answered Jan 22 '26 11:01


