I have a dataframe as follows:
ID Date Text
1 01/01/2019 abcd
1 01/01/2019 pqrs
2 01/02/2019 abcd
2 01/02/2019 xyze
I want to concatenate the Text values of all rows that share the same ID, using a group-by operation. The expected result is:
ID Date Text
1 01/01/2019 abcdpqrs
2 01/02/2019 abcdxyze
I want to do this in Python.
I tried the following code chunks, but neither of them worked:
groups = groupby(dataset_new, key=ID(1))
dataset_new.group_by{row['Reference']}.values.each do |group|
puts [group.first['Reference'], group.map{|r| r['Text']} * ' '] * ' | '
end
I also tried to merge the text in Excel using formulas, but that did not give the required result either.
Try groupby followed by sum. Judging from your error message and the output of df.info(), the Text column seems to contain mixed dtypes and NaN values. I suggest replacing NaN with an empty string using fillna('') and then converting every element in the column to a string using astype(str).
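import pandas as pd

# Sample data matching the question
df = pd.DataFrame({'ID': [1, 1, 2, 2],
                   'Date': ['01/01/2019', '01/01/2019', '01/02/2019', '01/02/2019'],
                   'Text': ['abcd', 'pqrs', 'abcd', 'xyze']})
# Replace NaN with '' and cast every value to str so concatenation cannot fail
df['Text'] = df['Text'].fillna('').astype(str)
# Summing strings within each (ID, Date) group concatenates them
df_grouped = df.groupby(['ID', 'Date'])['Text'].sum()
print(df_grouped)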
This should return
ID Date
1 01/01/2019 abcdpqrs
2 01/02/2019 abcdxyze
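If you would rather get back a regular DataFrame with ID, Date and Text as columns (the layout shown in the question) instead of a Series with a two-level index, one possible variant is to pass as_index=False and join the strings explicitly. This is only a sketch: the helper name merge_text and its sep parameter are made up for illustration and are not part of pandas or of the answer above.

```python
import pandas as pd


def merge_text(df, sep=''):
    """Hypothetical helper: concatenate Text within each (ID, Date) group."""
    out = df.copy()  # leave the caller's DataFrame untouched
    # Same cleanup as above: NaN -> '' and everything cast to str
    out['Text'] = out['Text'].fillna('').astype(str)
    # as_index=False keeps ID and Date as ordinary columns in the result
    return out.groupby(['ID', 'Date'], as_index=False)['Text'].agg(sep.join)


df = pd.DataFrame({'ID': [1, 1, 2, 2],
                   'Date': ['01/01/2019', '01/01/2019', '01/02/2019', '01/02/2019'],
                   'Text': ['abcd', 'pqrs', 'abcd', 'xyze']})
result = merge_text(df)
print(result)
```

Joining with an explicit separator also makes it easy to put a space or comma between the pieces, for example merge_text(df, sep=' '), which summing the strings does not allow.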