I have a df like this:
ID A1 A2 A3 A4 A5
1 1 2 3
2 1 2 3
3 2 1
4 3 1 2
5
For every ID, I have 5 columns A1 to A5 (In real I have many more) and the values are top priority for a particular ID.
For example: ID 1 has A1, A3 and A5 as priorites, , ID 3 has only 2 A2 and A1 and ID 5 has no Priorities
Resultant DF
ID Priority_1 Priority_2 Priority_3
1 A1 A3 A5
2 A1 A2 A4
3 A2 A1
4 A3 A5 A1
5
I am trying to same using melt and pivot using this and this_1 and many more, but exactly not able to get the same resultant df.
Any help on this or clarity from my side!!
Use DataFrame.melt with sorting by DataFrame.sort_values and removing missing rows by DataFrame.dropna, then add new column used for filtering by boolean indexing and Series.le for less or equal and last use DataFrame.pivot with DataFrame.add_prefix, last add DataFrame.reindex for added only mising rows ID:
N = 3
df1 = df.melt('ID').sort_values(['ID','value']).dropna(subset=['value'])
df1['new'] = df1.groupby('ID').cumcount().add(1)
df1 = df1[df1['new'].le(N)]
df2 = df1.pivot('ID','new','variable').add_prefix('Priority_').reindex(df['ID'])
print (df2)
new Priority_1 Priority_2 Priority_3
ID
1 A1 A3 A5
2 A1 A2 A4
3 A2 A1 NaN
4 A3 A5 A1
5 NaN NaN NaN
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With