What is the oldest leap year in pandas?

Question

I'm working with a day of year column that ranges from 1 to 366 (to account for leap years). I need to convert this column into a date for a specific task and I would like to set it to a year that is very unlikely to appear in my time series.

Is there a way to set it to the oldest leap year that pandas supports?

import pandas as pd

# Example where the data has already been converted to datetime objects;
# only the target year still needs to be chosen
dates = pd.Series(pd.to_datetime(['2023-05-01', '2021-12-15', '2019-07-20']))
first_leap_year = 2000  # placeholder: this is the value I don't know how to choose
new_dates = dates.apply(lambda d: d.replace(year=first_leap_year))

IMSoP · Accepted Answer

The documentation for the pandas.Timestamp type says:

Timestamp is the pandas equivalent of python’s Datetime and is interchangeable with it in most cases.

So we can look up the Python documentation for datetime objects, where we find:

Like a date object, datetime assumes the current Gregorian calendar extended in both directions; like a time object, datetime assumes there are exactly 3600*24 seconds in every day.

In other words, it assumes that the current rules for calculating leap years apply at any point in history, even though they were actually introduced in 1582, and adopted by different countries over the next few centuries. (The technical term for this is a "proleptic Gregorian calendar".)

Standard Python has a datetime.MINYEAR constant:

The smallest year number allowed in a date or datetime object. MINYEAR is 1.

So the lowest year that is divisible by 4 and not by 100 (and therefore a leap year under both the Gregorian and the Julian definitions) would be 4.
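As a quick check of that reasoning, here is a minimal sketch that uses only the standard library. It scans upward from datetime.MINYEAR and stops at the first year that calendar.isleap accepts; the variable names are illustrative only.

```python
import calendar
import datetime

# Scan upward from the smallest year datetime supports and stop at the
# first year that the proleptic Gregorian rules treat as a leap year.
first_leap_year = next(
    year
    for year in range(datetime.MINYEAR, datetime.MAXYEAR + 1)
    if calendar.isleap(year)
)
print(first_leap_year)

# February 29 is a valid date in that year, which confirms the result.
feb_29 = datetime.date(first_leap_year, 2, 29)
print(feb_29.month, feb_29.day)
```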

However, pandas supports a narrower range, bounded below by pandas.Timestamp.min:

Timestamp.min = Timestamp('1677-09-21 00:12:43.145224193')

(In case you're wondering, that's just under 2⁶³ nanoseconds before January 1, 1970, i.e. the lower limit of a 64-bit signed integer counting nanoseconds.)
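The arithmetic behind that bound can be reproduced without pandas. The sketch below subtracts roughly 2⁶³ nanoseconds from the Unix epoch using plain datetime objects. It assumes a naive (timezone-free) epoch and truncates to microseconds, because datetime cannot represent nanoseconds, so the last digits differ slightly from the pandas value.

```python
from datetime import datetime, timedelta

NANOSECONDS = 2**63  # magnitude of the smallest 64-bit signed integer

# datetime only has microsecond resolution, so the sub-microsecond part
# of the offset is dropped here.
offset = timedelta(microseconds=NANOSECONDS // 1000)

lower_bound = datetime(1970, 1, 1) - offset
print(lower_bound)  # 1677-09-21 00:12:43.145225
```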

Since 1677 is only partly representable, you probably want a year after it, which makes 1680 the earliest available leap year.
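To tie this back to the original day-of-year column, here is a minimal sketch of the conversion using 1680 as the reference year. It relies only on the standard library; the names REFERENCE_YEAR and day_of_year_to_date are hypothetical helpers for illustration, not part of pandas or of the question's code.

```python
from datetime import date, timedelta

REFERENCE_YEAR = 1680  # first leap year after pandas.Timestamp.min


def day_of_year_to_date(day_of_year: int) -> date:
    """Map a day of year (1 to 366) onto a date in the reference leap year."""
    if not 1 <= day_of_year <= 366:
        raise ValueError(f"day_of_year must be in 1..366, got {day_of_year}")
    # Day 1 is January 1, so the offset from the start of the year is one less.
    return date(REFERENCE_YEAR, 1, 1) + timedelta(days=day_of_year - 1)


for doy in (1, 60, 366):
    print(doy, day_of_year_to_date(doy))
```

Because the reference year is a leap year, day 60 lands on February 29 and day 366 on December 31, so every value in the 1 to 366 range has a valid date.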

Pierrick Rambaud · Answer

I ended up following the same reasoning as @IMSoP, but using pandas only, so it is less general.

First I looked up the earliest date that pandas can represent:

import pandas as pd

start_date = pd.Timestamp.min

Then I wrote a small function to check whether a year is a leap year and applied it to the 50 years starting from that earliest year (which is of course overkill, but I felt safer):

def is_leap_year(year):
    return (year % 4 == 0 and year % 100 != 0) or (year % 400 == 0)

# Timestamp.min falls in 1677, so the scan starts there and stops at the first hit
for year in range(start_date.year, start_date.year + 50):
    if is_leap_year(year):
        print(f"Oldest leap year in pandas: {year}")
        break

The final result is consistent with the datetime-based answer (which is great):

Oldest leap year in pandas: 1680
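As a sanity check, the hand-written rule can be compared with calendar.isleap from the standard library. The sketch below restates is_leap_year so that it runs on its own, and it assumes a range of 1677 to 2262, which are the years that nanosecond-resolution timestamps span; the upper bound is not part of the answers above.

```python
import calendar


def is_leap_year(year):
    return (year % 4 == 0 and year % 100 != 0) or (year % 400 == 0)


# Assumed range: the years covered by nanosecond-resolution timestamps.
years = range(1677, 2263)

# Collect any year where the two implementations disagree.
mismatches = [y for y in years if is_leap_year(y) != calendar.isleap(y)]
first_leap = next(y for y in years if calendar.isleap(y))

print(mismatches)
print(first_leap)
```

If the list of mismatches is empty, calendar.isleap could stand in for the custom function.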
