
Dataframe with one date and three distinct values: how can I get the one in the middle?

I have a dataframe containing the daily number of downloads for two apps. However, for every day I have 3 different download numbers: paid downloads (the highest value), organic downloads (the smallest value) and others (the middle value).

They are not labeled, so the only thing I know is that I need to order those three values and get the one in the middle. The original dataset looks like this:

id date downloads
100 2018-01-05 2000
100 2018-01-05 45000
100 2018-01-05 44000
110 2018-01-05 3000
110 2018-01-05 7000
110 2018-01-05 8000
100 2018-01-06 9000
100 2018-01-06 77000
100 2018-01-06 75000
110 2018-01-06 1000
110 2018-01-06 6000
110 2018-01-06 9000

And the final result I need would look like this:

id date downloads
100 2018-01-05 44000
110 2018-01-05 7000
100 2018-01-06 75000
110 2018-01-06 6000
Marcos Dias asked Sep 05 '25 03:09



1 Answer

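Sort by downloads first, then use groupby with nth to take the second element of each group. Without the sort, nth(1) returns the second row in the original order, which is not necessarily the middle value:

df.sort_values('downloads').groupby(['id', 'date'], as_index=False).nth(1)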
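For reference, here is a minimal runnable sketch of this approach on the sample data from the question. It assumes that all three rows of a group share the same date (as the expected output implies) and that every (id, date) group has exactly three rows. The variable names `middle` and `by_median` are only illustrative.

```python
import pandas as pd

# Sample data from the question, assuming three rows per (id, date) pair.
df = pd.DataFrame({
    "id": [100, 100, 100, 110, 110, 110, 100, 100, 100, 110, 110, 110],
    "date": ["2018-01-05"] * 6 + ["2018-01-06"] * 6,
    "downloads": [2000, 45000, 44000, 3000, 7000, 8000,
                  9000, 77000, 75000, 1000, 6000, 9000],
})

# Sort by downloads so that position 1 inside each group is the middle value,
# pick that row, then restore the original row order.
middle = (
    df.sort_values("downloads")
      .groupby(["id", "date"], as_index=False)
      .nth(1)
      .sort_index()
)
print(middle)

# Alternative: with an odd number of rows per group, the median is the middle value.
# Note that the result is ordered by the group keys, not by the original row order.
by_median = df.groupby(["id", "date"], as_index=False)["downloads"].median()
print(by_median)
```

If some groups might not have exactly three rows, nth(1) still returns the second-smallest value, while the median of an even-sized group is an average of two values, so the two approaches can diverge.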
mozway answered Sep 07 '25 17:09
