
convert multiple columns to datetime without the date in pandas

I have a dataframe with 3 columns, one for hour, one for minute, and one for second, like this:

import pandas as pd

df = pd.DataFrame({'hour': [9.0, 9.0, 9.0, 10.0],
                   'min': [12.0, 13.0, 55.0, 2.0],
                   'sec': [42.0, 30.0, 12.0, 5.0]})

>>> df
   hour   min   sec
0   9.0  12.0  42.0
1   9.0  13.0  30.0
2   9.0  55.0  12.0
3  10.0   2.0   5.0

I'm trying to combine the three columns into a new column made up of a datetime series. The goal would be to have this dataframe:

   hour   min   sec      time
0   9.0  12.0  42.0   9:12:42
1   9.0  13.0  30.0   9:13:30
2   9.0  55.0  12.0   9:55:12
3  10.0   2.0   5.0  10:02:05

So far I'm trying to use pd.to_datetime, as such:

df['time'] = pd.to_datetime(df[['hour', 'min', 'sec']],
                            format='%H:%M:%S')

But I'm getting the following error: ValueError: to assemble mappings requires at least that [year, month, day] be specified: [day,month,year] is missing.

I tried to avoid this by passing a format argument containing only hour, minute, and second, but that doesn't work.

A similar question was asked here, but the proposed solutions do not seem to work in this case; I still get this ValueError.

Any ideas to solve this would be appreciated!

Thanks!

[EDIT]: I also needed the method to handle NaNs, as in a dataframe such as this:

import numpy as np

df = pd.DataFrame({'hour': [9.0, 9.0, 9.0, 10.0, np.nan],
                   'min': [12.0, 13.0, 55.0, 2.0, np.nan],
                   'sec': [42.0, 30.0, 12.0, 5.0, np.nan]})

The solution proposed by @piRSquared works for this case.

sacuL asked Dec 10 '25 16:12


2 Answers

I'm not sure if there is a more direct way, but this works:

# Cast the float columns to int, then to str, and join them as 'H:M:S' before parsing
parts = df[['hour', 'min', 'sec']].astype(int).astype(str)
df['time'] = pd.to_datetime(parts['hour'] + ':' + parts['min'] + ':' + parts['sec'], format='%H:%M:%S').dt.time


    hour    min     sec     time
0   9.0     12.0    42.0    09:12:42
1   9.0     13.0    30.0    09:13:30
2   9.0     55.0    12.0    09:55:12
3   10.0    2.0     5.0     10:02:05
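
The string-based approach above casts every value to int, which raises an error when a row contains NaN (the case added in the question's edit). One possible workaround, offered as a minimal sketch rather than a tested recommendation, is to build the strings only for complete rows and let the remaining rows come out as missing. The names `cols`, `valid`, `parts`, and `time_str` are illustrative and not part of the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'hour': [9.0, 10.0, np.nan],
                   'min': [12.0, 2.0, np.nan],
                   'sec': [42.0, 5.0, np.nan]})

cols = ['hour', 'min', 'sec']
# True for rows where hour, min and sec are all present.
valid = df[cols].notna().all(axis=1)

# Build 'H:M:S' strings for the complete rows only, so astype(int) never sees NaN.
parts = df.loc[valid, cols].astype(int).astype(str)
time_str = parts['hour'] + ':' + parts['min'] + ':' + parts['sec']

# Parse, realign to the full index (skipped rows become NaT), then keep the time part.
parsed = pd.to_datetime(time_str, format='%H:%M:%S')
df['time'] = parsed.reindex(df.index).dt.time
```

Rows with any missing component end up with a missing value in 'time' instead of raising.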
Vaishali answered Dec 13 '25 06:12


We can use pd.to_datetime on a dataframe with the requisite column names to create a series of datetimes.

However, the OP's initial dataframe has a 'min' column that needs to be renamed 'minute' and a 'sec' column that needs to be renamed 'second'.

In addition, I'll add the missing columns 'year', 'month', and 'day' using pd.DataFrame.assign.

Finally, I'll add the 'time' column with pd.DataFrame.assign again.

new = dict(year=2017, month=1, day=1)
rnm = dict(min='minute', sec='second')
df.assign(
    time=pd.to_datetime(
        df.rename(columns=rnm).assign(**new)
    ).dt.time
)

   hour   min   sec      time
0   9.0  12.0  42.0  09:12:42
1   9.0  13.0  30.0  09:13:30
2   9.0  55.0  12.0  09:55:12
3  10.0   2.0   5.0  10:02:05
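
For the NaN case from the question's edit, the same rename-and-assign approach can be applied unchanged. The sketch below is a self-contained version under that assumption: the placeholder date (2017-01-01) is arbitrary and is discarded by `.dt.time`, and the variable name `result` is introduced here only for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'hour': [9.0, 9.0, 9.0, 10.0, np.nan],
                   'min': [12.0, 13.0, 55.0, 2.0, np.nan],
                   'sec': [42.0, 30.0, 12.0, 5.0, np.nan]})

# Placeholder date components; pd.to_datetime requires year, month and day
# when assembling datetimes from dataframe columns.
new = dict(year=2017, month=1, day=1)
# Map the existing column names to names pd.to_datetime recognises.
rnm = dict(min='minute', sec='second')

# assign() returns a new dataframe; the helper columns never touch df itself.
result = df.assign(
    time=pd.to_datetime(
        df.rename(columns=rnm).assign(**new)
    ).dt.time
)
```

The row whose components are all NaN should come out as a missing value (NaT) in 'time', while the original 'hour', 'min', and 'sec' columns are left as they were.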
piRSquared answered Dec 13 '25 06:12