I need to move back to the beginning of the month, but if I'm already at the beginning I want to stay there. A pandas anchored offset with n=0 is supposed to do exactly that, but subtracting MonthBegin does not produce the result I expect for dates that fall between anchor points.
For example, for this
pd.Timestamp('2017-01-06 00:00:00') - pd.tseries.offsets.MonthBegin(n=0)
I expect it to move me back to Timestamp('2017-01-01 00:00:00'),
but instead I get Timestamp('2017-02-01 00:00:00').
What am I doing wrong? Or do you think it's a bug?
I can also see that the same rule works fine for MonthEnd, so by combining the two like below
pd.Timestamp('2017-01-06 00:00:00') + pd.tseries.offsets.MonthEnd(n=0) - pd.tseries.offsets.MonthBegin(n=1)
I get the desired result of Timestamp('2017-01-01 00:00:00'),
but I expected it to work with just - pd.tseries.offsets.MonthBegin(n=0).
To jump to the month's start, use:
ts + pd.tseries.offsets.MonthEnd(n=0) - pd.tseries.offsets.MonthBegin(n=1)
Yes, it's ugly, but it is one method that jumps to the first of the month while staying there if ts is already the first.
Quick demo:
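>>> import datetime as dt, pandas as pd
>>> from pandas.tseries.offsets import MonthBegin, MonthEnd
>>> pd.date_range(dt.datetime(2016,12,30), dt.datetime(2017,2,2)).to_series() \
... + MonthEnd(n=0) - MonthBegin(n=1)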
2016-12-30 2016-12-01
2016-12-31 2016-12-01
2017-01-01 2017-01-01
2017-01-02 2017-01-01
...
2017-01-31 2017-01-01
2017-02-01 2017-02-01
2017-02-02 2017-02-01
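The expression above can be wrapped in a small function for reuse. This is only a minimal sketch: the name month_start is hypothetical and not part of pandas, and it assumes scalar Timestamp inputs.

```python
import pandas as pd
from pandas.tseries.offsets import MonthBegin, MonthEnd


def month_start(ts):
    """Hypothetical helper: move ts to the first of its month.

    MonthEnd(n=0) rolls forward to the month end (or stays put if ts is
    already there), then MonthBegin(n=1) is subtracted to step back to
    the first of that same month.
    """
    return ts + MonthEnd(n=0) - MonthBegin(n=1)


for day in ("2016-12-31", "2017-01-01", "2017-01-06"):
    print(day, "->", month_start(pd.Timestamp(day)).date())
```

The three sample dates cover a month end, a month start, and a day in between, matching the rows in the demo output above.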
This is indeed the correct behavior. It follows from the rules laid out in Anchored Offset Semantics for offsets that can be anchored to the start or end of a particular frequency.
Consider the given example:
import pandas as pd
from pandas.tseries.offsets import MonthBegin
pd.Timestamp('2017-01-02 00:00:00') - MonthBegin(n=0)
Out[18]:
Timestamp('2017-02-01 00:00:00')
Note that the anchor point for the MonthBegin offset is the first of every month. Since the given timestamp is past that day, it is not on an anchor point, so with n=0 it is rolled forward to the next anchor point (the first of the following month), whether the offset is added or subtracted.
Excerpt from the docs:
For the case when n=0, the date is not moved if on an anchor point, otherwise it is rolled forward to the next anchor point.
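The quoted rule can be checked directly. The snippet below is a small sketch using the two January dates from this thread; the variable names are made up for illustration.

```python
import pandas as pd
from pandas.tseries.offsets import MonthBegin

on_anchor = pd.Timestamp("2017-01-01")   # already the first of the month
off_anchor = pd.Timestamp("2017-01-06")  # between two anchor points

# With n=0, a date off the anchor rolls forward for both + and -.
added = off_anchor + MonthBegin(n=0)
subtracted = off_anchor - MonthBegin(n=0)

# With n=0, a date on the anchor is left where it is.
unchanged = on_anchor - MonthBegin(n=0)

print(added, subtracted, unchanged)
```

Negating an offset with n=0 still gives n=0, which is why subtraction behaves the same as addition here.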
To get what you're after for a date that is past the anchor point, you need to provide n=1, which rolls the timestamp back to the first of its month.
pd.Timestamp('2017-01-02 00:00:00') - MonthBegin(n=1)
Out[20]:
Timestamp('2017-01-01 00:00:00')
If the date is exactly on the anchor point, n=0 leaves it unchanged and gives you the desired result, as the docs state.
pd.Timestamp('2017-01-01 00:00:00') - MonthBegin(n=0)
Out[21]:
Timestamp('2017-01-01 00:00:00')
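As a possible alternative not mentioned in the answers above, pandas offsets also expose a rollback method, which rolls a date backward to the previous anchor only when it is not already on one. A minimal sketch, where roll_to_month_start is a hypothetical name chosen for this example:

```python
import pandas as pd
from pandas.tseries.offsets import MonthBegin


def roll_to_month_start(ts):
    """Hypothetical helper built on the offset's rollback method.

    rollback leaves ts alone if it is already on a MonthBegin anchor,
    otherwise it moves ts back to the most recent first of the month.
    """
    return MonthBegin().rollback(ts)


print(roll_to_month_start(pd.Timestamp("2017-01-06")))
print(roll_to_month_start(pd.Timestamp("2017-01-01")))
```

This covers both cases from the question in one call: dates past the first move back, and dates on the first stay put.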