Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Create data frame with month start and end in Python

Tags:

python

pandas

I want to create a pandas dataframe from a given start and end date:

import pandas as pd
from pandas.tseries.offsets import MonthEnd
start_date = "2020-05-17"
end_date = "2020-07-23"

For each row in this dataframe, I should have the start day and end day of the month, so the expected output is:

start       end         month   year
2020-05-17  2020-05-31  May     2020
2020-06-01  2020-06-30  June    2020
2020-07-01  2020-07-23  July    2020

I know I have to loop over each month between the interval created by start_date and end_date. While I know how to extract the last day in a date:

def last_day(date: str):
    return pd.Timestamp(date) + MonthEnd(1)

I'm stuck over how to run this over the interval. Any suggestion will be appreciated.

like image 929
Alexis Avatar asked Dec 22 '25 09:12

Alexis


2 Answers

You can use pd.date_range and pd.to_datetime:

start = pd.to_datetime([start_date] + pd.date_range(start_date, end_date, freq='MS').tolist())
end = pd.to_datetime(pd.date_range(start_date, end_date, freq='M').tolist() + [end_date])
month = start.strftime('%B')
year = start.year

df = pd.DataFrame({'start': start, 'end': end, 'month': month, 'year': year})

Output:

>>> df
       start        end month  year
0 2020-05-17 2020-05-31   May  2020
1 2020-06-01 2020-06-30  June  2020
2 2020-07-01 2020-07-23  July  2020
like image 56
Corralien Avatar answered Dec 23 '25 21:12

Corralien


You can transform the output from .isocalendar():

r = pd.date_range(start_date, end_date, freq="D").isocalendar()
out = (
    r.assign(month=r.index.month)
    .reset_index()
    .groupby(["year", "month"])["index"]
    .agg(("first", "last"))
    .reset_index()
)
print(out)

Prints:

   year  month      first       last
0  2020      5 2020-05-17 2020-05-31
1  2020      6 2020-06-01 2020-06-30
2  2020      7 2020-07-01 2020-07-23

To have string month names:

out = out.rename(columns={'first':'start', 'last':'end'})
out['month'] = pd.to_datetime(out['month'], format='%m').dt.strftime('%b')
print(out)

Prints:

   year month      start        end
0  2020   May 2020-05-17 2020-05-31
1  2020   Jun 2020-06-01 2020-06-30
2  2020   Jul 2020-07-01 2020-07-23
like image 30
Andrej Kesely Avatar answered Dec 23 '25 22:12

Andrej Kesely



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!