Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

ValueError: time data '10/11/2006 24:00' does not match format '%d/%m/%Y %H:%M'

I tried:

df["datetime_obj"] = df["datetime"].apply(lambda dt: datetime.strptime(dt, "%d/%m/%Y %H:%M"))

but got this error:

ValueError: time data '10/11/2006 24:00' does not match format '%d/%m/%Y %H:%M'

How to solve it correctly?

like image 903
ScalaBoy Avatar asked Feb 03 '26 17:02

ScalaBoy


2 Answers

The reason why this does not work is because the %H parameter only accepts values in the range of 00 to 23 (both inclusive). This thus means that 24:00 is - like the error says - not a valid time string.

I think therefore we have not much other options than convert the string to a valid format. We can do this by first replacing 24:00 with 00:00, and then later increment the day for these timestamps.

Like:

from datetime import timedelta
import pandas as pd

df['datetime_zero'] = df['datetime'].str.replace('24:00', '0:00')
df['datetime_er'] = pd.to_datetime(df['datetime_zero'], format='%d/%m/%Y %H:%M')
selrow = df['datetime'].str.contains('24:00')
df['datetime_obj'] = df['datetime_er'] + selrow * timedelta(days=1)

The last line thus adds one day to the rows that contain 24:00, such that '10/11/2006 24:00' gets converted to '11/11/2006 24:00'. Note however that the above is rather unsafe since depending on the format of the timestamp this will/will not work. For the above it will (probably) work, since there is only one colon. But if for example the datetimes have seconds as well, the filter could get triggered for 00:24:00, so it might require some extra work to get it working.

like image 138
Willem Van Onsem Avatar answered Feb 06 '26 06:02

Willem Van Onsem


Your data doesn't follow the conventions used by Python / Pandas datetime objects. There should be only one way of storing a particular datetime, i.e. '10/11/2006 24:00' should be rewritten as '11/11/2006 00:00'.

Here's one way to approach the problem:

# find datetimes which have '24:00' and rewrite
twenty_fours = df['strings'].str[-5:] == '24:00'
df.loc[twenty_fours, 'strings'] = df['strings'].str[:-5] + '00:00'

# construct datetime series
df['datetime'] = pd.to_datetime(df['strings'], format='%d/%m/%Y %H:%M')

# add one day where applicable
df.loc[twenty_fours, 'datetime'] += pd.DateOffset(1)

Here's some data to test:

dateList = ['10/11/2006 24:00', '11/11/2006 00:00', '12/11/2006 15:00']
df = pd.DataFrame({'strings': dateList})

Result after transformations described above:

print(df['datetime'])

0   2006-11-11 00:00:00
1   2006-11-11 00:00:00
2   2006-11-12 15:00:00
Name: datetime, dtype: datetime64[ns]
like image 22
jpp Avatar answered Feb 06 '26 07:02

jpp