
Subtracting rows based on conditions in other columns

Tags: r, dplyr

I am working with a dataframe which is similar to this:

df1 <- data.frame(p1 = c("John", "John", "John", "John", "John", "John", "Jim", "Jim", "Jim", "Jim", "Jim", "Jim", "Jim","Jim" ),
           elapsed_time = c(0, 4, 6, 9, 12, 14, 17, 22, 27, 35, 42, 47, 51, 57),
           event_type = c("start of period", "play", "play", "play", "play", "play", "play", "play", "play", "timeout", "play", "play", "play", "play"))

and looks like this:

     p1 elapsed_time      event_type
1  John            0 start of period
2  John            4            play
3  John            6            play
4  John            9            play
5  John           12            play
6  John           14            play
7   Jim           17            play
8   Jim           22            play
9   Jim           27            play
10  Jim           35         timeout
11  Jim           42            play
12  Jim           47            play
13  Jim           51            play
14  Jim           57            play

What I'd like to do is add a fourth column that gives the time elapsed since the most recent of three events: 1) event_type == "start of period", 2) event_type == "timeout", or 3) p1 changed (as in row 7, from John to Jim). Any of these three events should reset the fourth column to zero.

My desired output is

     p1 elapsed_time      event_type elapsed_time_since_last_break
1  John            0 start of period                             0
2  John            4            play                             4
3  John            6            play                             6
4  John            9            play                             9
5  John           12            play                            12
6  John           14            play                            14
7   Jim           17            play                             0
8   Jim           22            play                             5
9   Jim           27            play                            10
10  Jim           35         timeout                             0
11  Jim           42            play                             7
12  Jim           47            play                            12
13  Jim           51            play                            16
14  Jim           57            play                            22

I'm somewhat new to R and haven't had much success. I'm sure there's a simple solution I'm overlooking.

Matt asked Dec 01 '25 10:12


1 Answer

library(dplyr)

df1 %>%
  # group by player and by a counter that increments at every non-"play" row
  group_by(p1, elps = cumsum(event_type != 'play')) %>%
  mutate(elps = elapsed_time - elapsed_time[1])

# A tibble: 14 × 4
# Groups:   p1, elps [13]
   p1    elapsed_time event_type       elps
   <chr>        <dbl> <chr>           <dbl>
 1 John             0 start of period     0
 2 John             4 play                4
 3 John             6 play                6
 4 John             9 play                9
 5 John            12 play               12
 6 John            14 play               14
 7 Jim             17 play                0
 8 Jim             22 play                5
 9 Jim             27 play               10
10 Jim             35 timeout             0
11 Jim             42 play                7
12 Jim             47 play               12
13 Jim             51 play               16
14 Jim             57 play               22
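
One caveat: group_by(p1, ...) groups by player name, not by consecutive run. If a player came back on later without a timeout or a new period in between, his two stretches would share a group and the clock would not reset. That doesn't happen in the sample data, but here is a minimal sketch that guards against it, assuming a reset is wanted every time p1 differs from the previous row. The names is_break, run_id and result are helpers introduced for illustration, not part of the code above.

```r
library(dplyr)

# Sample data from the question
df1 <- data.frame(
  p1 = c(rep("John", 6), rep("Jim", 8)),
  elapsed_time = c(0, 4, 6, 9, 12, 14, 17, 22, 27, 35, 42, 47, 51, 57),
  event_type = c("start of period", rep("play", 8), "timeout", rep("play", 4))
)

result <- df1 %>%
  mutate(
    # TRUE on rows that reset the clock: a non-"play" event or a change of player
    is_break = event_type != "play" | p1 != lag(p1, default = first(p1)),
    # running count of breaks, so each uninterrupted stretch gets its own id
    run_id = cumsum(is_break)
  ) %>%
  group_by(run_id) %>%
  # time since the first row of the current stretch
  mutate(elapsed_time_since_last_break = elapsed_time - first(elapsed_time)) %>%
  ungroup() %>%
  select(-is_break, -run_id)

print(result)
```

On the sample data this gives the same elapsed_time_since_last_break column as the desired output in the question.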
KU99 answered Dec 03 '25 03:12


