Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Pandas: counting unique datetime values in group by gives weird values

So I got this DataFrame, built in a way so that for column id equal to 2, we have two different values in column num and my_date:

import pandas as pd

a = pd.DataFrame({'id': [1, 2, 3, 2], 
                  'my_date': [datetime(2017, 1, i) for i in range(1, 4)] + [datetime(2017, 1, 1)],
                  'num': [2, 3, 1, 4]
            })

For convenience, this is the DataFrame in a readable visual:

dataframe

If I want to count the number of unique values for each id, I'd do

grouped_a = a.groupby('id').agg({'my_date': pd.Series.nunique, 
                                 'num': pd.Series.nunique}).reset_index()
grouped_a.columns = ['id', 'num_unique_num', 'num_unique_my_date']

which gives this weird (?) result:

grouped_dataframe

Looks like the counting unique values on the datetime (which in Pandas converts to a datetime64[ns]) type is not working?

like image 798
mar tin Avatar asked Oct 26 '25 11:10

mar tin


1 Answers

It is bug, see github 14423.

But you can use SeriesGroupBy.nunique which works nice:

grouped_a = a.groupby('id').agg({'my_date': 'nunique', 
                                 'num': 'nunique'}).reset_index()
grouped_a.columns = ['id', 'num_unique_num', 'num_unique_my_date']
print (grouped_a)
   id  num_unique_num  num_unique_my_date
0   1               1                   1
1   2               2                   2
2   3               1                   1

If DataFrame have only 3 columns, you can use:

grouped_a = a.groupby('id').agg(['nunique']).reset_index()
grouped_a.columns = ['id', 'num_unique_num', 'num_unique_my_date']
print (grouped_a)
   id  num_unique_num  num_unique_my_date
0   1               1                   1
1   2               2                   2
2   3               1                   1
like image 184
jezrael Avatar answered Oct 29 '25 01:10

jezrael



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!