I have the data frame below, which is sorted by user and timestamp (the timestamp is written as an integer here to keep things simple).
I've added a column that gives the time difference from the previous activity in minutes, using pandas diff(). I define actions as belonging to the same session if they happen within 30 minutes of each other. Finding new sessions is then easy: I just check whether timediff is 'NaT' or greater than 30.
import pandas as pd

d = {'id': [123, 123, 123, 123, 123, 123, 234, 234],
     'activity': ['view', 'click', 'click', 'view', 'click', 'view', 'click', 'view'],
     'timestamp': [1, 2, 3, 4, 5, 6, 1, 2],
     'timediff_min': ['NaT', 1, 36, 2, 6, 124, 'NaT', 1],
     'new_session': [1, 0, 1, 0, 0, 1, 1, 0]}
df = pd.DataFrame(d)
df
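For reference, the two helper columns can be derived rather than typed in by hand. This is a minimal sketch, assuming the timestamps are plain minute counts; the values below are hypothetical and were chosen only so that the gaps match the timediff_min column above.

```python
import pandas as pd

# Hypothetical minute-count timestamps whose gaps reproduce timediff_min above
df = pd.DataFrame({
    'id': [123, 123, 123, 123, 123, 123, 234, 234],
    'timestamp': [1, 2, 38, 40, 46, 170, 1, 2],
})

# Gap to the previous activity of the same user (NaN on each user's first row)
df['timediff_min'] = df.groupby('id')['timestamp'].diff()

# A session starts on a user's first row or after a gap of more than 30 minutes
df['new_session'] = (
    df['timediff_min'].isna() | (df['timediff_min'] > 30)
).astype(int)
```

Taking the diff per id keeps one user's last activity from being compared with the next user's first activity.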
This yields the 'new_session' column. I can now filter down to a dataframe with the timestamp of each session start, but I would also like the timestamp of the final activity in each session so that I can calculate session length. If a session contains a single activity, the session start and session end will be the same. If it contains more than one activity, the session start will be the first activity and the session end will be the final activity before the next session starts. So the final output would be something like this:
d2 = {'id': [123, 123, 123, 234],
      'activity': ['view', 'click', 'view', 'click'],
      'timestamp': [1, 3, 6, 1],
      'timediff_min': ['NaT', 36, 124, 'NaT'],
      'new_session': [1, 1, 1, 1],
      'session_start': [1, 3, 6, 1],
      'session_end': [2, 5, 6, 2]}
pd.DataFrame(d2)
Any help would be appreciated. Thanks!
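I solved this using the following approach:
import numpy as np
import pandas as pd

# Here d is the DataFrame; 'timestamp' is assumed to be a real datetime column
d['time_diff'] = d.groupby('id')['timestamp'].diff()
# A new session starts on a user's first row or after a gap of more than 30 minutes
is_new = d.time_diff.isnull() | (d.time_diff > pd.Timedelta(minutes=30))
d['new_sess'] = np.where(is_new, 'yes', 'no')
new_sessions = np.where(is_new)
# Mark each session start with its row position, then forward-fill within the session
d['sess_count'] = np.nan
d.iloc[new_sessions[0], d.columns.get_loc('sess_count')] = new_sessions[0]
d['sess_count'] = d['sess_count'].ffill()
d['sess_id'] = d.id.astype(str) + '-' + d.sess_count.astype(int).astype(str)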
This creates unique session IDs that I can then group by to get the min and max timestamps.
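The final grouping step might look like the sketch below. It is only an illustration under stated assumptions: the timestamps are hypothetical minute counts, and sessions are numbered with a cumulative sum over the new-session flag (a swapped-in technique, not the row-position IDs used above). The column names session_no and session_length are made up for the example.

```python
import pandas as pd

# Hypothetical minute-count timestamps, sorted by user and time
df = pd.DataFrame({
    'id': [123, 123, 123, 123, 123, 123, 234, 234],
    'timestamp': [1, 2, 38, 40, 46, 170, 1, 2],
})

# Flag session starts: a user's first row, or a gap of more than 30 minutes
gap = df.groupby('id')['timestamp'].diff()
is_new = (gap.isna() | (gap > 30)).astype(int)

# Running count of session starts per user gives a session number
df['session_no'] = is_new.groupby(df['id']).cumsum()

# One row per session with its first and last timestamp
sessions = (
    df.groupby(['id', 'session_no'])
      .agg(session_start=('timestamp', 'min'),
           session_end=('timestamp', 'max'))
      .reset_index()
)
sessions['session_length'] = sessions['session_end'] - sessions['session_start']
```

A single-activity session ends up with the same start and end, so its length is 0.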