I would like to analyze statistics per car: which cars were repaired and which are new. A data sample is:
Name IsItNew ControlDate
Car1 True 31/01/2018
Car2 True 28/02/2018
Car1 False 15/03/2018
Car2 True 16/04/2018
Car3 True 30/04/2018
Car2 False 25/05/2018
Car1 False 30/05/2018
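For reference, a minimal sketch that reproduces this sample as a DataFrame (ControlDate is kept as strings here, exactly as shown above; it could be parsed with pd.to_datetime(..., dayfirst=True) if real dates are needed):
import pandas as pd

df = pd.DataFrame({
    'Name': ['Car1', 'Car2', 'Car1', 'Car2', 'Car3', 'Car2', 'Car1'],
    'IsItNew': [True, True, False, True, True, False, False],
    'ControlDate': ['31/01/2018', '28/02/2018', '15/03/2018', '16/04/2018',
                    '30/04/2018', '25/05/2018', '30/05/2018'],
})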
So I should group by Name, and if there is a False in the IsItNew column for a car, I should set IsItNew to False and take the first date on which False occurred.
I tried groupby with nunique():
df = df.groupby(['Name', 'IsItNew', 'ControlDate'])['Name'].nunique()
But it returns the count of unique items in each group.
How can I get only the grouped unique items, without any count?
Actual result is:
Name IsItNew ControlDate
Car1 True 31/01/2018 1
False 15/03/2018 1
30/05/2018 1
Car2 True 28/02/2018 1
16/04/2018 1
False 25/05/2018 1
Car3 True 30/04/2018 1
Expected Result is:
Name IsItNew ControlDate
Car1 False 15/03/2018
Car2 False 25/05/2018
Car3 True 30/04/2018
I'd appreciate any ideas. Thanks!
One way to do it would be to group by Name and aggregate on IsItNew with two functions: a custom one using any to check whether there are any False values, and idxmin to find the index of the first False (or, if the group has no False, its first row), which you can later use to index the dataframe on ControlDate:
# Named aggregation (pandas >= 0.25); the nested-dict column renaming used
# for this in older pandas versions was removed in pandas 1.0.
df_ = (df.groupby('Name')
         .agg(IsItNew=('IsItNew', lambda x: ~(~x).any()),  # False if the group contains any False
              ControlDate=('IsItNew', 'idxmin'))           # row index of the first False (or first row if all True)
         .reset_index())

# Replace the stored row indices with the corresponding ControlDate values
df_['ControlDate'] = df.loc[df_['ControlDate'].values, 'ControlDate'].reset_index(drop=True)
  Name IsItNew ControlDate
0 Car1 False 15/03/2018
1 Car2 False 25/05/2018
2 Car3 True 30/04/2018
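An alternative sketch (just another option, assuming ControlDate is parsed as a real date so it sorts chronologically): sort each car's rows so that False and earlier dates come first, then keep the first row per Name:
out = (df.assign(ControlDate=pd.to_datetime(df['ControlDate'], dayfirst=True))
         .sort_values(['Name', 'IsItNew', 'ControlDate'])  # False sorts before True
         .drop_duplicates('Name')                          # first row per car after sorting
         .reset_index(drop=True))
For the sample above this produces the same three rows, with ControlDate as datetimes rather than strings.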