I was wondering whether I can sum multiple rows that share the same value in one column. For example, say I have a dataframe A:
col1 col2 col3 col4
A 0.1 0.2 0.3
B 0.4 0.5 0.6
A 0.7 0.8 0.9
C 1.0 1.1 1.2
The end result should be:
col1 col2 col3 col4
A 0.8 1.0 1.2
B 0.4 0.5 0.6
C 1.0 1.1 1.2
The first and third rows of the dataframe have the same value (A) in col1, so they are summed into one row. How can I implement this?
In [83]: A.set_index('col1').groupby(level=0).sum()
Out[83]:
      col2  col3  col4
col1
A      0.8   1.0   1.2
B      0.4   0.5   0.6
C      1.0   1.1   1.2
or
In [152]: A.set_index('col1').groupby(level=0).sum().reset_index()
Out[152]:
  col1  col2  col3  col4
0    A   0.8   1.0   1.2
1    B   0.4   0.5   0.6
2    C   1.0   1.1   1.2
Use groupby with aggregation sum:
df1 = df.groupby('col1', as_index=False).sum()
print(df1)
  col1  col2  col3  col4
0    A   0.8   1.0   1.2
1    B   0.4   0.5   0.6
2    C   1.0   1.1   1.2
Or group first and then turn the index back into a column:
df1 = df.groupby('col1').sum().reset_index()
print(df1)
  col1  col2  col3  col4
0    A   0.8   1.0   1.2
1    B   0.4   0.5   0.6
2    C   1.0   1.1   1.2
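For anyone who wants to try this end to end, here is a minimal sketch that rebuilds the example frame from the question and applies the groupby sum. The variable name `result` and the rounding step are my own additions, not part of the answers above; the rounding is only there because float addition can leave tiny errors (0.1 + 0.7 is not exactly 0.8).

```python
import pandas as pd

# Rebuild the example dataframe from the question.
A = pd.DataFrame({
    "col1": ["A", "B", "A", "C"],
    "col2": [0.1, 0.4, 0.7, 1.0],
    "col3": [0.2, 0.5, 0.8, 1.1],
    "col4": [0.3, 0.6, 0.9, 1.2],
})

# Group rows that share a col1 value and sum the remaining columns.
# as_index=False keeps col1 as a regular column instead of the index.
result = A.groupby("col1", as_index=False).sum()

# Round for display only; the stored sums keep full float precision.
print(result.round(2))
```

If col1 should end up as the index instead, dropping `as_index=False` should give that shape.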