Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Split up duration while upsampling dataframe

How do i split up a duration while upsampleing a dataframe, as in the example below. And can i replace the for loop, with e.g. the group_by function?

I want to use pandas to transform data like this:

  activity name         time started           time ended
0       Bedtime  2021-10-25 00:00:00  2021-10-25 08:25:42
1        videos  2021-10-25 08:25:42  2021-10-25 08:51:54
2       Commute  2021-10-25 08:51:54  2021-10-25 09:29:34

into this:

time started        Bedtime         videos           Commute                   
2021-10-25 00:00:00 0 days 01:00:00 0 days 00:00:00  0 days
2021-10-25 01:00:00 0 days 01:00:00 0 days 00:00:00  0 days
2021-10-25 02:00:00 0 days 01:00:00 0 days 00:00:00  0 days
2021-10-25 03:00:00 0 days 01:00:00 0 days 00:00:00  0 days
2021-10-25 04:00:00 0 days 01:00:00 0 days 00:00:00  0 days
2021-10-25 05:00:00 0 days 01:00:00 0 days 00:00:00  0 days
2021-10-25 06:00:00 0 days 01:00:00 0 days 00:00:00  0 days
2021-10-25 07:00:00 0 days 01:00:00 0 days 00:00:00  0 days
2021-10-25 08:00:00 0 days 00:25:42 0 days 00:26:12  0 days 00:08:06
...

And i get this far:

import pandas as pd
df=pd.DataFrame({'activity name':['Bedtime','videos','Commute'],'time started':["2021-10-25 00:00:00","2021-10-25 08:25:42","2021-10-25 08:51:54"],'time ended':["2021-10-25 08:25:42","2021-10-25 08:51:54","2021-10-25 09:29:34"]})
# converting strings to datetime
df['time ended']=pd.to_datetime(df['time ended'])
df['time started']=pd.to_datetime(df['time started'])

# calclating the duration
df['duration']=df['time ended']-df['time started']

# changeing index
df.index=df['time started']
df=df.drop(columns=['time started','time ended'])

for a in df['activity name'].unique():
    df[a]=(df['activity name']==a)*df['duration']

df=df.drop(columns=['activity name','duration'])
df.resample('H').first()
time started                                               
2021-10-25 00:00:00 0 days 08:25:42 0 days 00:00:00  0 days
2021-10-25 01:00:00             NaT             NaT     NaT
2021-10-25 02:00:00             NaT             NaT     NaT
2021-10-25 03:00:00             NaT             NaT     NaT
2021-10-25 04:00:00             NaT             NaT     NaT
2021-10-25 05:00:00             NaT             NaT     NaT
2021-10-25 06:00:00             NaT             NaT     NaT
2021-10-25 07:00:00             NaT             NaT     NaT
2021-10-25 08:00:00 0 days 00:00:00 0 days 00:26:12  0 days
like image 683
Erysol Avatar asked Feb 26 '26 04:02

Erysol


1 Answers

Try this:

import pandas as pd
from io import StringIO

txtfile = StringIO(
    """  activity name         time started           time ended
0       Bedtime  2021-10-25 00:00:00  2021-10-25 08:25:42
1        videos  2021-10-25 08:25:42  2021-10-25 08:51:54
2       Commute  2021-10-25 08:51:54  2021-10-25 09:29:34"""
)

df = pd.read_csv(txtfile, sep="\s\s+", engine="python")

df[["time started", "time ended"]] = df[["time started", "time ended"]].apply(
    pd.to_datetime
)
df_e = df.assign(
    date=[
        pd.date_range(s, e, freq="s")
        for s, e in zip(df["time started"], df["time ended"])
    ]
).explode("date")

df_out = (
    df_e.groupby(["activity name", pd.Grouper(key="date", freq="H")])["activity name"]
    .count()
    .unstack(0)
    .apply(pd.to_timedelta, unit="s")
)

print(df_out)

Output:

activity name               Bedtime         Commute          videos
date                                                               
2021-10-25 00:00:00 0 days 01:00:00             NaT             NaT
2021-10-25 01:00:00 0 days 01:00:00             NaT             NaT
2021-10-25 02:00:00 0 days 01:00:00             NaT             NaT
2021-10-25 03:00:00 0 days 01:00:00             NaT             NaT
2021-10-25 04:00:00 0 days 01:00:00             NaT             NaT
2021-10-25 05:00:00 0 days 01:00:00             NaT             NaT
2021-10-25 06:00:00 0 days 01:00:00             NaT             NaT
2021-10-25 07:00:00 0 days 01:00:00             NaT             NaT
2021-10-25 08:00:00 0 days 00:25:43 0 days 00:08:06 0 days 00:26:13
2021-10-25 09:00:00             NaT 0 days 00:29:35             NaT

Address @DerekO comment:

import pandas as pd
from io import StringIO

txtfile = StringIO(
    """  activity name         time started           time ended
0       Bedtime  2021-10-25 00:00:00  2021-10-25 08:25:42
1        videos  2021-10-25 08:25:42  2021-10-25 08:51:54
2       Commute  2021-10-25 08:51:54  2021-10-25 09:29:34
3       Bedtime  2021-10-25 11:00:00  2021-10-25 13:04:31"""
)

df = pd.read_csv(txtfile, sep="\s\s+", engine="python")

df[["time started", "time ended"]] = df[["time started", "time ended"]].apply(
    pd.to_datetime
)
df_e = df.assign(
    date=[
        pd.date_range(s, e, freq="s")
        for s, e in zip(df["time started"], df["time ended"])
    ]
).explode("date")

df_out = (
    df_e.groupby(["activity name", pd.Grouper(key="date", freq="H")])["activity name"]
    .count()
    .unstack(0)
    .apply(pd.to_timedelta, unit="s")
    .sort_index()
)

print(df_out)

Output:

activity name               Bedtime         Commute          videos
date                                                               
2021-10-25 00:00:00 0 days 01:00:00             NaT             NaT
2021-10-25 01:00:00 0 days 01:00:00             NaT             NaT
2021-10-25 02:00:00 0 days 01:00:00             NaT             NaT
2021-10-25 03:00:00 0 days 01:00:00             NaT             NaT
2021-10-25 04:00:00 0 days 01:00:00             NaT             NaT
2021-10-25 05:00:00 0 days 01:00:00             NaT             NaT
2021-10-25 06:00:00 0 days 01:00:00             NaT             NaT
2021-10-25 07:00:00 0 days 01:00:00             NaT             NaT
2021-10-25 08:00:00 0 days 00:25:43 0 days 00:08:06 0 days 00:26:13
2021-10-25 09:00:00             NaT 0 days 00:29:35             NaT
2021-10-25 11:00:00 0 days 01:00:00             NaT             NaT
2021-10-25 12:00:00 0 days 01:00:00             NaT             NaT
2021-10-25 13:00:00 0 days 00:04:32             NaT             NaT
like image 89
Scott Boston Avatar answered Feb 27 '26 18:02

Scott Boston



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!