I have a DataFrame and I am using .aggregate({'col1': np.sum}), which sums the values in col1. Is it possible to perform a count instead, something like .aggregate({'col1': some count function here})?
You can use 'size', 'count', or 'nunique', depending on your use case. They differ as follows:
'size': the count including NaN and repeated values.
'count': the count excluding NaN but including repeats.
'nunique': the count of unique values, excluding repeats and NaN.
For example, consider the following DataFrame:
import numpy as np
import pandas as pd
df = pd.DataFrame({'col0': list('aabbcc'), 'col1': [1, 1, 2, np.nan, 3, 4]})
  col0  col1
0    a   1.0
1    a   1.0
2    b   2.0
3    b   NaN
4    c   3.0
5    c   4.0
Then using the three functions described:
df.groupby('col0')['col1'].agg(['size', 'count', 'nunique'])
      size  count  nunique
col0
a        2      2        1
b        2      1        1
c        2      2        2
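To tie this back to the dict form used in the question, here is a minimal sketch that passes the same function names as strings inside a dict. It reuses the example DataFrame from above; the variable names (per_group, summary, total) and the output column names (n_rows, n_non_null, n_unique) are made up for illustration, and the named-aggregation form assumes pandas 0.25 or later.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col0': list('aabbcc'), 'col1': [1, 1, 2, np.nan, 3, 4]})

# Dict form, as in the question: map the column name to an aggregation name.
per_group = df.groupby('col0').agg({'col1': 'count'})

# Named aggregation (assumes pandas >= 0.25): several counts of one column,
# each under an illustrative output column name.
summary = df.groupby('col0').agg(
    n_rows=('col1', 'size'),         # includes NaN and repeats
    n_non_null=('col1', 'count'),    # excludes NaN, includes repeats
    n_unique=('col1', 'nunique'),    # excludes NaN and repeats
)

# Without a groupby, the same dict form counts non-NaN values in the column.
total = df.agg({'col1': 'count'})
```

Passing the names as strings lets pandas dispatch to its own implementations, so the NaN handling matches the descriptions given above.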