So I got this DataFrame, built in a way so that for column id equal to 2, we have two different values in column num and my_date:
import pandas as pd
a = pd.DataFrame({'id': [1, 2, 3, 2],
'my_date': [datetime(2017, 1, i) for i in range(1, 4)] + [datetime(2017, 1, 1)],
'num': [2, 3, 1, 4]
})
For convenience, this is the DataFrame in a readable visual:

If I want to count the number of unique values for each id, I'd do
grouped_a = a.groupby('id').agg({'my_date': pd.Series.nunique,
'num': pd.Series.nunique}).reset_index()
grouped_a.columns = ['id', 'num_unique_num', 'num_unique_my_date']
which gives this weird (?) result:

Looks like the counting unique values on the datetime (which in Pandas converts to a datetime64[ns]) type is not working?
It is bug, see github 14423.
But you can use SeriesGroupBy.nunique which works nice:
grouped_a = a.groupby('id').agg({'my_date': 'nunique',
'num': 'nunique'}).reset_index()
grouped_a.columns = ['id', 'num_unique_num', 'num_unique_my_date']
print (grouped_a)
id num_unique_num num_unique_my_date
0 1 1 1
1 2 2 2
2 3 1 1
If DataFrame have only 3 columns, you can use:
grouped_a = a.groupby('id').agg(['nunique']).reset_index()
grouped_a.columns = ['id', 'num_unique_num', 'num_unique_my_date']
print (grouped_a)
id num_unique_num num_unique_my_date
0 1 1 1
1 2 2 2
2 3 1 1
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With