I'm trying to create a new column with the calendar year's quarter end date (e.g. if today is 4 August, this quarter's end date would be 30 September).
My DataFrame has a set of dates in a column called df['dates']. Sample below:
03/08/2017
26/02/2015
31/12/2012
16/04/2014
13/04/2016
This is the code I am using, which is not working:
df['q_date'] = pd.datetime(pd.tseries.offsets.QuarterEnd(startingMonth=(pd.to_datetime(df['date']).dt.month)))
I have also tried looping through but have been unsuccessful.
Any ideas are most appreciated. Thanks.
You can simply add pd.tseries.offsets.QuarterEnd(0):
df['qdate'] = pd.to_datetime(df['date']) + pd.tseries.offsets.QuarterEnd(0)
print (df)
date qdate
0 03/08/2017 2017-03-31
1 26/02/2015 2015-03-31
2 31/12/2012 2012-12-31
3 16/04/2014 2014-06-30
4 13/04/2016 2016-06-30
Timeings:
#5000 rows
df = pd.concat([df]*1000).reset_index(drop=True)
In [19]: %timeit df.assign(q_date=[pd.Period(d, freq='Q').end_time for d in df['date']])
1 loop, best of 3: 1.84 s per loop
In [20]: %timeit pd.to_datetime(df['date']) + pd.tseries.offsets.QuarterEnd(0)
1 loop, best of 3: 913 ms per loop
#50000 rows
df = pd.concat([df]*10000).reset_index(drop=True)
In [23]: %timeit df.assign(q_date=[pd.Period(d, freq='Q').end_time for d in df['date']])
1 loop, best of 3: 18.5 s per loop
In [24]: %timeit pd.to_datetime(df['date']) + pd.tseries.offsets.QuarterEnd(0)
1 loop, best of 3: 9.13 s per loop
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With