
Record the largest series for each id in python

I want to keep the one record that has the largest series for each id, so for each id I need exactly one row. I think I need something like

df_new = df.groupby('id')['series'].nlargest(1)

, but that's definitely wrong.

That's how my dataset looks:

id  series s1 s2 s3
1   2      4  9  1
1   8      6  2  2
1   3      9  1  3
2   9      4  1  5
2   2      2  5  5
2   5      1  7  8
3   6      7  2  3
3   2      4  4  1
3   1      3  9  9

This should be the result:

id  series s1 s2 s3
1   8      6  2  2
2   9      4  1  5
3   6      7  2  3
matthew asked Nov 21 '25 08:11



1 Answer

IIUC, you want to group by the 'id' column, use idxmax() to get the index label of the row where 'series' is largest within each group, and then use those labels to index back into the original df:

In [91]:
df.loc[df.groupby('id')['series'].idxmax()]

Out[91]:
   id  series  s1  s2  s3
1   1       8   6   2   2
3   2       9   4   1   5
6   3       6   7   2   3
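
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample data from the question and applies the same idxmax() approach. The names idx, df_new and df_alt are only illustrative. The attempt in the question, groupby('id')['series'].nlargest(1), returns just the series values rather than whole rows, which is why the lookup through df.loc is needed. The last line shows one possible alternative (sort, then keep the first row per id); it is not part of the answer above, only another way to get the same rows for this data.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "id":     [1, 1, 1, 2, 2, 2, 3, 3, 3],
    "series": [2, 8, 3, 9, 2, 5, 6, 2, 1],
    "s1":     [4, 6, 9, 4, 2, 1, 7, 4, 3],
    "s2":     [9, 2, 1, 1, 5, 7, 2, 4, 9],
    "s3":     [1, 2, 3, 5, 5, 8, 3, 1, 9],
})

# For each id, idxmax() gives the index label of the row with the largest 'series'
idx = df.groupby("id")["series"].idxmax()

# Use those labels to pull the full rows out of the original frame
df_new = df.loc[idx]
print(df_new)

# Alternative sketch: sort by 'series' descending, keep the first row per id,
# then restore the id order
df_alt = (
    df.sort_values("series", ascending=False)
      .drop_duplicates("id")
      .sort_values("id")
)
```

If two rows of the same id tie for the largest series, idxmax() keeps only the first of them, so you still get one row per id.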
EdChum answered Nov 22 '25 23:11



