I have the following dataframe:
DAYS7 <- c('Monday','Tuesday','Wednesday','Thursday','Friday', 'Saturday', 'Sunday')
DAYS <- rep(DAYS7,5)
A <- c(0,0,0,0,0,1,1,0,0,0,0,1,1,1,0,0,0,0,1,1,1,1,1,0,0,1,1,1,0,0,0,0,0,1,1)
B <- c(0,0,0,0,0,1,2,0,0,0,0,1,2,3,0,0,0,0,1,2,3,4,5,0,0,1,2,3,0,0,0,0,0,1,2)
DESIRED_OUTPUT <- c(2,2,2,2,2,2,2,3,3,3,3,3,3,3,5,5,5,5,5,5,5,5,5,3,3,3,3,3,2,2,2,2,2,2,2)
dataframe <- as.data.frame(cbind(DAYS, A, B, DESIRED_OUTPUT))
If it's a working day, the corresponding cell in column A equals zero. If it is holiday, column A shows 1. (Note that there might be long weekends, e.g. row 12, row 19, rows 22-23!) Column B is essentially a cummulative sum of column A, with the condition that whenever a zero occurs, the cummulation resets. What I intend to do is to calculate the length of each weekend (in days) and then print the result in an additional column like this (see column DESIRED_OUTPUT):

In Excel it is relatively easy…

…but I'm struggling to find a proper solution in R. Any suggestions?
The tricky part is to define one week which is group of 0's followed by group of 1's. We can do that using rle, once that is done we can use that as a grouping variable in ave and count number of 1's in each group using sum.
with(dataframe, ave(A, with(rle(A == 0), rep(cumsum(values), lengths)), FUN = sum))
#[1] 2 2 2 2 2 2 2 3 3 3 3 3 3 3 5 5 5 5 5 5 5 5 5 3 3 3 3 3 2 2 2 2 2 2 2
Since B is cumulative sum of A, we can use that as well here and count the max value in group.
with(dataframe, ave(B, with(rle(A == 0), rep(cumsum(values), lengths)), FUN = max))
The dplyr way...
library(tidyverse)
DAYS7 <- c('Monday','Tuesday','Wednesday','Thursday','Friday', 'Saturday', 'Sunday')
DAYS <- rep(DAYS7,5)
A <- c(0,0,0,0,0,1,1,0,0,0,0,1,1,1,0,0,0,0,1,1,1,1,1,0,0,1,1,1,0,0,0,0,0,1,1)
B <- c(0,0,0,0,0,1,2,0,0,0,0,1,2,3,0,0,0,0,1,2,3,4,5,0,0,1,2,3,0,0,0,0,0,1,2)
DESIRED_OUTPUT <- c(2,2,2,2,2,2,2,3,3,3,3,3,3,3,5,5,5,5,5,5,5,5,5,3,3,3,3,3,2,2,2,2,2,2,2)
dataframe <- data.frame(DAYS, A, B, DESIRED_OUTPUT)
dataframe %>%
mutate(beginningOfWeek = (B - lag(B)) < 0) %>%
mutate(beginningOfWeek = replace_na(beginningOfWeek, TRUE)) %>%
mutate(week_nr = cumsum(beginningOfWeek)) %>%
group_by(week_nr) %>%
mutate(desired_output = max(B))
DAYS A B DESIRED_OUTPUT beginningOfWeek week_nr desired_output
<fct> <dbl> <dbl> <dbl> <lgl> <int> <dbl>
1 Monday 0 0 2 TRUE 1 2
2 Tuesday 0 0 2 FALSE 1 2
3 Wednesday 0 0 2 FALSE 1 2
4 Thursday 0 0 2 FALSE 1 2
5 Friday 0 0 2 FALSE 1 2
6 Saturday 1 1 2 FALSE 1 2
7 Sunday 1 2 2 FALSE 1 2
8 Monday 0 0 3 TRUE 2 3
9 Tuesday 0 0 3 FALSE 2 3
10 Wednesday 0 0 3 FALSE 2 3
11 Thursday 0 0 3 FALSE 2 3
12 Friday 1 1 3 FALSE 2 3
13 Saturday 1 2 3 FALSE 2 3
14 Sunday 1 3 3 FALSE 2 3
15 Monday 0 0 5 TRUE 3 5
16 Tuesday 0 0 5 FALSE 3 5
17 Wednesday 0 0 5 FALSE 3 5
18 Thursday 0 0 5 FALSE 3 5
19 Friday 1 1 5 FALSE 3 5
20 Saturday 1 2 5 FALSE 3 5
21 Sunday 1 3 5 FALSE 3 5
22 Monday 1 4 5 FALSE 3 5
23 Tuesday 1 5 5 FALSE 3 5
24 Wednesday 0 0 3 TRUE 4 3
25 Thursday 0 0 3 FALSE 4 3
26 Friday 1 1 3 FALSE 4 3
27 Saturday 1 2 3 FALSE 4 3
28 Sunday 1 3 3 FALSE 4 3
29 Monday 0 0 2 TRUE 5 2
30 Tuesday 0 0 2 FALSE 5 2
31 Wednesday 0 0 2 FALSE 5 2
32 Thursday 0 0 2 FALSE 5 2
33 Friday 0 0 2 FALSE 5 2
34 Saturday 1 1 2 FALSE 5 2
35 Sunday 1 2 2 FALSE 5 2
So, I didn't remove the intermediate columns, so that it's clearer, what they are actually for.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With