import pandas as pd
df = pd.DataFrame({'x':[1,2,1,2,1,3,2],'y':[34,23,23,65,45,12,28],'z':['a','b','a','','a','c','b']})
df.groupby('x').z.count().reset_index()
   x  z
0  1  3
1  2  3
2  3  1
But this is not what I want: the empty string should not be included in the count. What I want is:
   x  z
0  1  3
1  2  2
2  3  1
What should I do?
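In pandas, an empty string is not considered null, so count() includes it. Replace the empty strings with NaN first, then run the same groupby. (np.nan is used here because the np.NAN alias was removed in NumPy 2.0.)
import numpy as np
df['z'] = df['z'].replace({'': np.nan})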
df.groupby('x').z.count().reset_index()
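For reference, here is a self-contained sketch of this approach using the data from the question. The variable names counts and counts_alt are only illustrative, and the second variant (summing a boolean mask instead of introducing NaN) is an alternative not mentioned above, shown on the assumption that only exact empty strings should be excluded.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'x': [1, 2, 1, 2, 1, 3, 2],
    'y': [34, 23, 23, 65, 45, 12, 28],
    'z': ['a', 'b', 'a', '', 'a', 'c', 'b'],
})

# count() skips NaN but not '', so turn empty strings into NaN first.
# assign() works on a copy, so the original df is left unchanged.
counts = (
    df.assign(z=df['z'].replace('', np.nan))
      .groupby('x')['z']
      .count()
      .reset_index()
)

# Alternative without NaN: mark non-empty strings as True and sum per group.
counts_alt = (
    df['z'].ne('')
      .groupby(df['x'])
      .sum()
      .reset_index()
)

print(counts)
print(counts_alt)
```

Both variants should give one row per value of x. The mask variant also reports 0 for a group whose z values are all empty, as does count() after the replacement.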
Using replace:
df.replace({'z':''},np.nan).groupby('x').z.count().reset_index()
x z
0 1 3
1 2 2
2 3 1
Or, with as_index=False instead of reset_index():
df.replace({'z':''},np.nan).groupby('x',as_index=False).z.count()