Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to count the number of quarters between two dates

Question 1:

How would you count the number of quarters between the starting date (base_date variable) and the y column in the pandas dataframe?

from dateutil.relativedelta import relativedelta
import pandas as pd
import numpy as np

base_date=pd.to_datetime('07/01/2019')
fake_data=pd.DataFrame([(x,pd.to_datetime('04/01/2020')+relativedelta(months=y)) for x in list(range(0,100)) for y in list(range(0,100))],columns=['x','y'])
fake_data['z']=fake_data['x']*np.random.uniform(low=1,high=1000)

UPDATE: one option seems to work for question1:

fake_data['month_diff']=(fake_data['y'].dt.year-base_date.year)*12+fake_data['y'].dt.month-base_date.month
fake_data['quarter']=(fake_data['month_diff']/3).astype(int)

Question 2:

If the x and y variables are now set to a pandas dataframe multi-level index. How would you perform the same task as in question 1?

fake_data.set_index(['x','y'],drop=True,inplace=True)
like image 489
kdb Avatar asked Oct 21 '25 05:10

kdb


2 Answers

This is a pretty neat way to use pandas built-in period differencing:

import pandas as pd
t = pd.to_datetime('2025Q4').to_period(freq='Q') - pd.to_datetime('1850Q2').to_period(freq='Q')
print(t.n)  # -> 702

For dates in a series, you will need to use a lambda:

import pandas as pd
# Setup
data = {'end': ['2023-01-01', '2023-04-01', '2023-07-01'],
        'start': ['2022-12-01', '2023-01-01', '2023-01-01']}
df = pd.DataFrame(data)
df['end'] = pd.to_datetime(df['end'])
df['start'] = pd.to_datetime(df['start'])

#Compute the difference in quarters
diff = (df['end'].dt.to_period('Q')-df['start'].dt.to_period('Q')).apply(lambda x: x.n)
like image 184
dlm Avatar answered Oct 24 '25 09:10

dlm


Question 1:

A year contains 4 quarters. A simple way to count the difference of quarters between to dates is to convert them to year * 4 + quarter and use the difference:

fake_data.y.dt.year * 4 + fake_data.y.dt.quarter - (base_date.year * 4 + base_date.quarter)

Question 2:

The previous method still applies, but on index.levels[1]:

fake_data.index.levels[1].year * 4 + fake_data.index.levels[1].quarter - (base_date.year * 4 + base_date.quarter)

Simply in first question you get a plain Series, while here it is an Index...

like image 45
Serge Ballesta Avatar answered Oct 24 '25 10:10

Serge Ballesta