I have a data set that describes the number of days (BedDays) each patient who stays in a hospital:
my_df<-data.frame(RecordID=c("1","2","3","4","5","6","7","8","9","10"),
StartDate=c("2020-03-02","2020-03-05"
,"2020-03-08","2020-03-11"
,"2020-03-14","2020-03-17"
,"2020-03-20","2020-03-23"
,"2020-03-26","2020-03-29"),
BedDays=c(12,3,2,11,1,10,18,10,3,6))
And would like to count the number of beds being used each day. I would like my results to be:

So far, I can only think of the use of arrays and loop. Use array to list the dates that each patient that stays in hospital, then use loop to look through the arrays to count the patients in the hospital on a certain day. This will be the count of beds being used on a certain day. This sounds complicated.
Is there any clever functions/packages that can do this transformation easily?
I think the key is to "expand" your BedDays to a sequence of days, then count those days.
library(tidyverse)
my_df %>%
mutate(StartDate = as.Date(StartDate)) %>%
reframe(duration = seq(StartDate, StartDate + BedDays - 1, by = 1), .by = RecordID) %>%
count(duration)
duration n
1 2020-03-02 1
2 2020-03-03 1
3 2020-03-04 1
4 2020-03-05 2
5 2020-03-06 2
6 2020-03-07 2
7 2020-03-08 2
8 2020-03-09 2
9 2020-03-10 1
10 2020-03-11 2
11 2020-03-12 2
12 2020-03-13 2
13 2020-03-14 2
14 2020-03-15 1
15 2020-03-16 1
16 2020-03-17 2
17 2020-03-18 2
18 2020-03-19 2
19 2020-03-20 3
20 2020-03-21 3
21 2020-03-22 2
22 2020-03-23 3
23 2020-03-24 3
24 2020-03-25 3
25 2020-03-26 4
26 2020-03-27 3
27 2020-03-28 3
28 2020-03-29 3
29 2020-03-30 3
30 2020-03-31 3
31 2020-04-01 3
32 2020-04-02 2
33 2020-04-03 2
34 2020-04-04 1
35 2020-04-05 1
36 2020-04-06 1
The code below starts by splitting the data by date, then it creates data.frames with as many rows as BedDays, puts all those temp df's together and aggregates the data to have counts of beds by date.
my_df<-data.frame(RecordID=c("1","2","3","4","5","6","7","8","9","10"),
StartDate=c("2020-03-02","2020-03-05"
,"2020-03-08","2020-03-11"
,"2020-03-14","2020-03-17"
,"2020-03-20","2020-03-23"
,"2020-03-26","2020-03-29"),
BedDays=c(12,3,2,11,1,10,18,10,3,6))
my_df$StartDate <- as.Date(my_df$StartDate)
my_df |>
split(my_df$StartDate) |>
lapply(\(x) {
StartDate <- x$StartDate + seq.int(x$BedDays) - 1L
data.frame(StartDate, NumBeds = 1L)
}) |>
do.call(rbind, args = _) |>
aggregate(NumBeds ~ StartDate, data = _, sum)
#> StartDate NumBeds
#> 1 2020-03-02 1
#> 2 2020-03-03 1
#> 3 2020-03-04 1
#> 4 2020-03-05 2
#> 5 2020-03-06 2
#> 6 2020-03-07 2
#> 7 2020-03-08 2
#> 8 2020-03-09 2
#> 9 2020-03-10 1
#> 10 2020-03-11 2
#> 11 2020-03-12 2
#> 12 2020-03-13 2
#> 13 2020-03-14 2
#> 14 2020-03-15 1
#> 15 2020-03-16 1
#> 16 2020-03-17 2
#> 17 2020-03-18 2
#> 18 2020-03-19 2
#> 19 2020-03-20 3
#> 20 2020-03-21 3
#> 21 2020-03-22 2
#> 22 2020-03-23 3
#> 23 2020-03-24 3
#> 24 2020-03-25 3
#> 25 2020-03-26 4
#> 26 2020-03-27 3
#> 27 2020-03-28 3
#> 28 2020-03-29 3
#> 29 2020-03-30 3
#> 30 2020-03-31 3
#> 31 2020-04-01 3
#> 32 2020-04-02 2
#> 33 2020-04-03 2
#> 34 2020-04-04 1
#> 35 2020-04-05 1
#> 36 2020-04-06 1
Created on 2024-01-02 with reprex v2.0.2
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With