Count the number of certain values in a data frame after group by

I have a data frame as follows:

    userID  Correct
0   1050    F
1   1050    T
2   1050    T
3   1050    F
4   1050    F
5   1050    F
6   1050    F
7   1050    F
8   1050    F
9   1050    F
10  1051    F
11  1051    F
12  1051    F
13  1051    F
14  1051    F
15  1051    T
16  1051    F
17  1051    F
18  1051    F
19  1051    T

What I want to do is count the number of T's in the "Correct" column for every user. That is, after grouping the data frame by userID, I want a column holding the number of T's for each user.

Here is what I have tried, but it is clearly wrong:

df.groupby('userID').agg({'Correct': lambda x: (x == T).count()})
HimanAB asked Dec 03 '25 23:12


1 Answer

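You are really close. Compare against the string 'T' (with quotes) and use sum instead of count: sum adds up the True values, while count counts every non-null value whether it is True or False.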

df1 = df.groupby('userID').agg({'Correct': lambda x: (x == 'T').sum()})
print (df1)
        Correct
userID         
1050          2
1051          2
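
To see the difference between count and sum on this data, here is a minimal self-contained sketch. It rebuilds the sample frame from the question by hand (the construction of df is my own, not the asker's code) and runs both aggregations side by side:

```python
import pandas as pd

# Rebuild the 20-row sample from the question: 10 rows per user.
df = pd.DataFrame({
    "userID": [1050] * 10 + [1051] * 10,
    "Correct": list("FTTFFFFFFF") + list("FFFFFTFFFT"),
})

grouped = df.groupby("userID")["Correct"]

# count() counts every non-null value of the boolean mask,
# so it returns the group size, not the number of T's.
by_count = grouped.agg(lambda x: (x == "T").count())

# sum() treats True as 1 and False as 0, so it returns the number of T's.
by_sum = grouped.agg(lambda x: (x == "T").sum())

print(by_count)
print(by_sum)
```

With this data, count should give 10 for each user (the number of rows) and sum should give 2 for each user (the number of T's).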

A better approach is to filter first and then count:

df1 = df[df['Correct'] == 'T'].groupby('userID').size().to_frame('Correct')
print (df1)
        Correct
userID         
1050          2
1051          2

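Filtering drops any userID that has no T at all. To get 0 for those users, add reindex (the output below assumes the data also contains a userID 333 with no T):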

df1 = (df[df['Correct'] == 'T'].groupby('userID')
                              .size()
                              .reindex(df['userID'].unique(), fill_value=0)
                              .to_frame('Correct'))
print (df1)
        Correct
userID         
1050          2
1051          2
333           0
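
A runnable sketch of the zero case, under the assumption of a small made-up frame in which user 333 has no T (the data here is hypothetical, not the asker's). It also shows one alternative, summing the boolean mask grouped by userID, which keeps every user without needing reindex:

```python
import pandas as pd

# Hypothetical data: user 333 is an assumption added to show the zero case.
df = pd.DataFrame({
    "userID": [1050, 1050, 1050, 1051, 1051, 333, 333],
    "Correct": ["F", "T", "T", "T", "T", "F", "F"],
})

# Filter, count, then restore the users that the filter removed.
counts = (
    df[df["Correct"] == "T"]
    .groupby("userID")
    .size()
    .reindex(df["userID"].unique(), fill_value=0)
    .to_frame("Correct")
)
print(counts)

# Alternative: sum the boolean mask per user; no user is dropped,
# because nothing is filtered out before grouping.
alt = df["Correct"].eq("T").groupby(df["userID"]).sum()
print(alt)
```

The reindex version keeps users in order of first appearance in df, while the grouped sum returns them sorted by userID.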
jezrael answered Dec 05 '25 12:12
