Pandas upsampling does not include the 23 hours of last day in year

Question

I have a time series DataFrame of daily weather values, indexed by date, that looks like this:

2017-01-01 5
2017-01-02 10
.
.
2017-12-31 6

I am trying to upsample it to hourly data using the following: weather.resample('H').pad()

I expected to see 8760 rows (24 hours * 365 days). However, it returns only 8737: the last 23 hours of 31 December are missing. Is there something special I need to do to get all 24 hours for the last day?

Thanks in advance.

RichieV · Accepted Answer

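Pandas treats 2017-12-31 as 2017-12-31 00:00, and resample creates an hourly range that ends at that last timestamp, so the remaining 23 hours of the final day are never generated (364 * 24 + 1 = 8737 rows). I would add one extra row before resampling, and drop it again afterwards, with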

df.loc['2018-01-01'] = 0
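
To make the extra-row idea concrete, here is a minimal sketch of the whole round trip. The frame below is made-up stand-in data (not the asker's), and it uses ffill(), which is the current spelling of pad() in recent pandas versions.

```python
import numpy as np
import pandas as pd

# Hypothetical daily data standing in for the asker's frame
weather = pd.DataFrame(
    {"WEATHER_MAX": np.arange(365, dtype=float)},
    index=pd.date_range("2017-01-01", "2017-12-31", freq="D"),
)

# Without the extra row, the hourly range stops at 2017-12-31 00:00
short = weather.resample("h").ffill()

# Add a sentinel row for the following midnight, resample,
# then drop the sentinel so only the hours of 2017 remain
padded = weather.copy()
padded.loc[pd.Timestamp("2018-01-01")] = 0.0
hourly = padded.resample("h").ffill().iloc[:-1]

print(len(short), len(hourly))
```

The value stored in the sentinel row does not matter, since that row is discarded after the forward fill.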

Edit: You can also get the result you want with numpy.repeat, without adding an extra row.

Take this DataFrame:

import numpy as np
import pandas as pd

np.random.seed(1)
weather = pd.DataFrame(index=pd.date_range('2017-01-01', '2017-12-31'),
    data={'WEATHER_MAX': np.random.random(365)*15})

            WEATHER_MAX
2017-01-01     6.255330
2017-01-02    10.804867
2017-01-03     0.001716
2017-01-04     4.534989
2017-01-05     2.201338
...                 ...
2017-12-27     4.503725
2017-12-28     2.145087
2017-12-29    13.519627
2017-12-30     8.123391
2017-12-31    14.621106

[365 rows x 1 columns]

Repeating each value 24 times along axis=1 gives one column per hour, labelled 0 to 23 by default. After stacking, those labels can be converted to hourly timedeltas and added to the dates:

# repeat, then stack
hourly = pd.DataFrame(np.repeat(weather.values, 24, axis=1),
    index=weather.index).stack()

# combine date and hour
hourly.index = (
    hourly.index.get_level_values(0) +
    pd.to_timedelta(hourly.index.get_level_values(1), unit='h')
)
hourly = hourly.rename('WEATHER_MAX').to_frame()

Output

                     WEATHER_MAX
2017-01-01 00:00:00     6.255330
2017-01-01 01:00:00     6.255330
2017-01-01 02:00:00     6.255330
2017-01-01 03:00:00     6.255330
2017-01-01 04:00:00     6.255330
...                          ...
2017-12-31 19:00:00    14.621106
2017-12-31 20:00:00    14.621106
2017-12-31 21:00:00    14.621106
2017-12-31 22:00:00    14.621106
2017-12-31 23:00:00    14.621106

[8760 rows x 1 columns]
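
As a further alternative that is not part of the original answer, the hourly index can be built explicitly and the daily frame forward-filled onto it with reindex. This is only a sketch on made-up data; the names daily, full_index and hourly_alt are illustrative.

```python
import numpy as np
import pandas as pd

# Hypothetical daily data, one value per day of 2017
daily = pd.DataFrame(
    {"WEATHER_MAX": np.arange(365, dtype=float)},
    index=pd.date_range("2017-01-01", "2017-12-31", freq="D"),
)

# Hourly index that runs through 23:00 on the last day
full_index = pd.date_range(
    daily.index[0],
    daily.index[-1] + pd.Timedelta(hours=23),
    freq="h",
)

# Forward-fill each day's value across its 24 hours
hourly_alt = daily.reindex(full_index, method="ffill")

print(hourly_alt.shape)
```

This keeps the end of the range explicit, so no sentinel row has to be added and removed.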
