I am able to create quarterly and monthly PeriodIndex like so:
idx = pd.PeriodIndex(year=[2000, 2001], quarter=[1,2], freq="Q") # quarterly
idx = pd.PeriodIndex(year=[2000, 2001], month=[1,2], freq="M") # monthly
I would expect to be able to create a yearly PeriodIndex like so:
idx = pd.PeriodIndex(year=[2000, 2001], freq="Y")
Instead this throws the following error:
Traceback (most recent call last):
File ".../script.py", line 3, in <module>
idx = pd.PeriodIndex(year=[2000, 2001], freq="Y")
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/pandas/core/indexes/period.py", line 250, in __new__
data, freq2 = PeriodArray._generate_range(None, None, None, freq, fields)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/pandas/core/arrays/period.py", line 316, in _generate_range
subarr, freq = _range_from_fields(freq=freq, **fields)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/pandas/core/arrays/period.py", line 1160, in _range_from_fields
ordinals.append(libperiod.period_ordinal(y, mth, d, h, mn, s, 0, 0, base))
File "pandas/_libs/tslibs/period.pyx", line 1109, in pandas._libs.tslibs.period.period_ordinal
TypeError: an integer is required
This seems like it should be very easy to do, yet I cannot work out what is going wrong. Can anybody help?
Both month and year are required "fields" in the current implementation (through pandas 1.5.1 at least). Most other fields fall back to a default value when omitted, but neither month nor year is given one. In this case month therefore stays None, which causes the error shown:
TypeError: an integer is required
The default values are defined in the pandas source code (see _range_from_fields in pandas/core/arrays/period.py, which appears in the traceback above). Omitting the month field leaves month as [None, None] (in this case), which cannot be converted to a PeriodIndex.
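The failure mode can be mimicked without pandas. The following is only a simplified model of the behaviour described above, not the library's actual code: fill_field_defaults and broadcast_fields are hypothetical names, and the fallback values (day=1, hour/minute/second=0) are assumptions made for illustration.

```python
def fill_field_defaults(year=None, month=None, day=None,
                        hour=None, minute=None, second=None):
    """Hypothetical stand-in for the defaulting step: only some fields get a fallback."""
    return {
        "year": year,    # no default: stays None if omitted
        "month": month,  # no default: stays None if omitted
        "day": 1 if day is None else day,
        "hour": 0 if hour is None else hour,
        "minute": 0 if minute is None else minute,
        "second": 0 if second is None else second,
    }


def broadcast_fields(fields):
    """Repeat scalar fields (including None) to the length of the longest list."""
    length = max(len(v) for v in fields.values() if isinstance(v, list))
    return {k: v if isinstance(v, list) else [v] * length
            for k, v in fields.items()}


fields = broadcast_fields(fill_field_defaults(year=[2000, 2001]))
print(fields["month"])  # [None, None] -> nothing an integer-only routine can use
print(fields["day"])    # [1, 1]
```

Each row of these fields is then handed to a routine that expects integers, so the None in the month slot is what triggers "an integer is required".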
A correct index can be built as follows.
idx = pd.PeriodIndex(year=[2000, 2001], month=[1, 1], freq='Y')
Resulting in:
PeriodIndex(['2000', '2001'], dtype='period[A-DEC]')
Depending on the number of years, it may also make sense to programmatically generate the list of months:
years = [2000, 2001]
idx = pd.PeriodIndex(year=years, month=[1] * len(years), freq='Y')
As an alternative, it may be easier to use to_datetime + to_period and create the PeriodIndex from a DatetimeIndex instead, since the dates are then already in a compatible form:
pd.to_datetime([2000, 2001], format='%Y').to_period('Y')
Resulting in the same PeriodIndex:
PeriodIndex(['2000', '2001'], dtype='period[A-DEC]')
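Two further ways to avoid passing a dummy month are sketched below. They go beyond the original answer and assume that the years are available as a plain list; the variable names are illustrative only. The printed dtype name may differ between pandas versions, so the example inspects the years instead.

```python
import pandas as pd

years = [2000, 2001]

# Option 1: let pandas parse year strings directly as annual periods.
idx_from_strings = pd.PeriodIndex([str(y) for y in years], freq="Y")

# Option 2: for a contiguous run of years, generate the whole range.
idx_from_range = pd.period_range(start=str(years[0]), end=str(years[-1]), freq="Y")

print(idx_from_strings.year.tolist())  # [2000, 2001]
print(idx_from_strings.equals(idx_from_range))
```

The string-based form also works for non-contiguous years, whereas period_range only suits an unbroken span.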