I'm writing a Python script to clean a CSV file we receive from Qualtrics for an entrepreneurship competition.
So far, I've sliced the data and written it back to an Excel file with Pandas. However, some columns hold values that I need to turn into new rows. For example, for each team submission we have:
   Team Name  Nb of teammates  Team Leader One  Team Leader Two
1  x          2                Joe              Joey
2  y          1                Jack
...
I would need to return:
   Team Name  Nb of teammates  Team Leader
1  x          2                Joe
2                              Joey
3  y          1                Jack
...
This is a very simplified example of the real data, which has more columns, but I was wondering how I could do that in Pandas/Python.
I'm aware of the discussions on Inserting Row and on Indexing: Setting with enlargement, but I don't know what I should do.
Thanks for your help!
You can use melt:
import pandas as pd

# set up the frame
df = pd.DataFrame({'Team Name': ['x', 'y'], 'Nb of teammates': [2, 1], 'Team Leader One': ['Joe', 'Jack'], 'Team Leader Two': ['Joey', None]})
Melt the frame:
pd.melt(df, id_vars=['Team Name', 'Nb of teammates'], value_vars=['Team Leader One', 'Team Leader Two']).dropna()
returns:
  Team Name  Nb of teammates         variable value
0         x                2  Team Leader One   Joe
1         y                1  Team Leader One  Jack
2         x                2  Team Leader Two  Joey
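If you want the result closer to the layout in the question (a single "Team Leader" column, with each team's rows kept together), something like the sketch below may work. It assumes every leader column name starts with "Team Leader" and that the frame has a default integer index; the names `leader_cols`, `id_cols` and `result` are just illustrative. Note that it repeats the team name on every row rather than leaving it blank.

```python
import pandas as pd

# Same sample frame as above
df = pd.DataFrame({
    "Team Name": ["x", "y"],
    "Nb of teammates": [2, 1],
    "Team Leader One": ["Joe", "Jack"],
    "Team Leader Two": ["Joey", None],
})

# Assumption: leader columns are exactly those starting with "Team Leader"
leader_cols = [c for c in df.columns if c.startswith("Team Leader")]
id_cols = [c for c in df.columns if c not in leader_cols]

result = (
    df.reset_index()  # keep the original row position in an "index" column
      .melt(id_vars=["index"] + id_cols,
            value_vars=leader_cols,
            value_name="Team Leader")
      .dropna(subset=["Team Leader"])         # drop teams' empty leader slots
      .sort_values("index", kind="mergesort")  # stable sort: regroup rows by team
      .drop(columns=["index", "variable"])
      .reset_index(drop=True)
)

print(result)
```

The stable sort on the saved row position puts each team's leaders back next to each other while keeping the leader columns in their original order within a team.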