How do I turn the below input data (Pandas dataframe fed from Excel file):
ID Category Speaker Price
334014 Real Estate Perspectives Tom Smith 100
334014 E&E Tom Smith 200
334014 Real Estate Perspectives Janet Brown 100
334014 E&E Janet Brown 200
into this:
ID Category Speaker Price
334014 Real Estate Perspectives Tom Smith, Janet Brown 100
334014 E&E Tom Smith, Janet Brown 200
So basiscally I want to group by Category, concatenate the Speakers, but not aggregate Price.
I tried different approaches with Pandas dataframe.groupby()
and .agg()
, but to no avail. Maybe there is simpler pure Python solution?
There are 2 possible solutions - aggregate by multiple columns and join
:
dataframe.groupby(['ID','Category','Price'])['Speaker'].apply(','.join)
Or need aggregate only Price
column, then is necessary aggregate all columns by first
or last
:
dataframe.groupby('Price').agg({'Speaker':','.join, 'ID':'first', 'Price':'first'})
Try this
df.groupby(['ID','Category'],as_index=False).agg(lambda x : x if x.dtype=='int64' else ', '.join(x))
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With