I have a list of timestamps recording when people entered a room and when they left it:
05/04/2017 14:20    05/04/2017 17:54
05/04/2017 13:10    06/04/2017 07:56
05/04/2017 10:30    05/04/2017 11:04
So a person entered at 14:20 and left at 17:54. A person entered at 13:10 one day and left at 07:56 the next.
What I would like to do is look at how many people were in the room between certain hours of the day, e.g. there were two people in the room between 14:00 and 15:00. I would then like to graph this data so I can see the number of people in the room over different time periods.
My question is: is there a name for this kind of analysis, and is this something that a package like Pandas can do? I can (probably) write an algorithm to do this, but before doing that I wanted to check whether it is a 'known problem'.
Problems of this sort appear in different applications (e.g. in physics it is called mass balance), but AFAIK they have no common name. Their essence is simple counting, though, so it is easier to write an algorithm than to find a ready-made solution to exactly your problem :)
This code calculates the number of people who have entered the room and the number who have exited it up to a given time, and then subtracts the second from the first:
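import pandas as pd

# Toy data: the hour each person entered ('in') and left ('out')
data = pd.DataFrame({'in':[10, 11, 11, 12, 14], 'out':[11, 13, 15, 14, 15]})
# Number of entries and number of exits at each hour
count_in = data.groupby('in')['in'].count()
count_out = data.groupby('out')['out'].count()
# Align both counts on the hour, treat missing hours as 0, take running totals
count_data = pd.concat([count_in, count_out], axis=1).fillna(0).cumsum()
# Occupancy = cumulative entries minus cumulative exits
print(count_data['in'] - count_data['out'])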
The code gives a result of:
10    1.0
11    2.0
12    3.0
13    2.0
14    2.0
15    0.0
This means that at 10 there was one person in the room (who had just arrived), at 11 there were two (two more arrived but one left), and so on.
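For the timestamps in the question, a slightly different counting rule may be closer to what is asked: for each one-hour window, count the visits that overlap it. The sketch below is only one way to do that, not the method from the code above. The column names `enter` and `leave`, the day-first date format, and the one-hour bin width are all assumptions made for illustration.

```python
import pandas as pd

# The three visits from the question; column names are illustrative.
visits = pd.DataFrame({
    "enter": ["05/04/2017 14:20", "05/04/2017 13:10", "05/04/2017 10:30"],
    "leave": ["05/04/2017 17:54", "06/04/2017 07:56", "05/04/2017 11:04"],
})
# Assumption: dates are day-first (5 April 2017, not 4 May 2017).
fmt = "%d/%m/%Y %H:%M"
visits["enter"] = pd.to_datetime(visits["enter"], format=fmt)
visits["leave"] = pd.to_datetime(visits["leave"], format=fmt)

# One-hour bins from the hour of the first entry to the last exit.
hour = pd.Timedelta(hours=1)
start = visits["enter"].min().replace(minute=0, second=0, microsecond=0)
bin_starts = pd.date_range(start, visits["leave"].max(), freq=hour)

# A visit overlaps the bin [b, b + 1h) if it starts before the bin ends
# and ends after the bin starts.
occupancy = pd.Series(
    [((visits["enter"] < b + hour) & (visits["leave"] > b)).sum()
     for b in bin_starts],
    index=bin_starts,
)
print(occupancy)
```

With this rule the 14:00 bin contains two people, matching the example in the question. If matplotlib is installed, `occupancy.plot(kind="bar")` should give a quick graph of the counts. The loop compares every visit against every bin, which is fine for small data but may be slow for very large logs.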