I have the following code. I calculate the percent growth between consecutive data points, and then the change in that percent growth. What I am looking for is a way to add a column that counts the number of consecutive readings where the change in percent growth is negative:
library(dplyr)
df <- data.frame(id = c(1,2,3,4,5,6,7,8,9,10,11,12), data = c(19, 19, 27, 27, 38, 42, 47, 48, 49, 50, 51, 53))
df <- mutate(df, pct_growth = (data - lag(data))/lag(data))
df <- mutate(df, pct_growth_change = pct_growth - lag(pct_growth))
df$pct_growth_streak <- 0
df <- dplyr::mutate(df, pct_growth_streak = ifelse(pct_growth_change <=0, lag(pct_growth_streak)+1,0) )
The output I am getting is the following:
   id data pct_growth pct_growth_change pct_growth_streak
1   1   19         NA                NA                NA
2   2   19 0.00000000                NA                NA
3   3   27 0.42105263      0.4210526316                 0
4   4   27 0.00000000     -0.4210526316                 1
5   5   38 0.40740741      0.4074074074                 0
6   6   42 0.10526316     -0.3021442495                 1
7   7   47 0.11904762      0.0137844612                 0
8   8   48 0.02127660     -0.0977710233                 1
9   9   49 0.02083333     -0.0004432624                 1
10 10   50 0.02040816     -0.0004251701                 1
11 11   51 0.02000000     -0.0004081633                 1
12 12   53 0.03921569      0.0192156863                 0
And what I need is:
   id data pct_growth pct_growth_change pct_growth_streak
1   1   19         NA                NA                NA
2   2   19 0.00000000                NA                NA
3   3   27 0.42105263      0.4210526316                 0
4   4   27 0.00000000     -0.4210526316                 1
5   5   38 0.40740741      0.4074074074                 0
6   6   42 0.10526316     -0.3021442495                 1
7   7   47 0.11904762      0.0137844612                 0
8   8   48 0.02127660     -0.0977710233                 1
9   9   49 0.02083333     -0.0004432624                 2
10 10   50 0.02040816     -0.0004251701                 3
11 11   51 0.02000000     -0.0004081633                 4
12 12   53 0.03921569      0.0192156863                 0
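Note that stats::lag shifts the time base of a time series back by one step. It does not change the values, only the times. For a series of four observations, lag changes the tsp attribute from c(1, 4, 1) to c(0, 3, 1): the start time moves from 1 to 0, the end time moves from 4 to 3, and the frequency remains 1 because a shift does not change it. The lag() called inside mutate() in the question is dplyr::lag, which instead shifts the values of a vector by one position.
The opposite of dplyr's lag() function is lead(), which shifts the values the other way.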
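To make the difference concrete, here is a small sketch on the first four values of the question's data. The variable names (x, prev, nxt, shifted) are chosen only for illustration, and the explicit dplyr:: and stats:: prefixes are there to avoid relying on which lag() happens to be masked.

```r
library(dplyr)

# First four readings from the question, used only as sample input.
x <- c(19, 19, 27, 27)

# dplyr::lag() pairs each element with the previous one;
# the first position has no predecessor and becomes NA.
prev <- dplyr::lag(x)

# dplyr::lead() pairs each element with the next one;
# the last position has no successor and becomes NA.
nxt <- dplyr::lead(x)

# stats::lag() leaves the values alone and only moves the time base.
shifted <- stats::lag(ts(x), 1)

print(prev)
print(nxt)
print(tsp(shifted))
```

Here prev and nxt have their values moved, while shifted keeps the same four values and only its tsp attribute differs from that of ts(x).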
We can use rleid from data.table to label each run of consecutive identical values and then calculate cumsum within each run.
library(data.table)
setDT(df)[, pct_growth_streak := cumsum(pct_growth_streak), 
            rleid(pct_growth_streak)]
df
#    id data pct_growth pct_growth_change pct_growth_streak
# 1:  1   19         NA                NA                NA
# 2:  2   19 0.00000000                NA                NA
# 3:  3   27 0.42105263      0.4210526316                 0
# 4:  4   27 0.00000000     -0.4210526316                 1
# 5:  5   38 0.40740741      0.4074074074                 0
# 6:  6   42 0.10526316     -0.3021442495                 1
# 7:  7   47 0.11904762      0.0137844612                 0
# 8:  8   48 0.02127660     -0.0977710233                 1
# 9:  9   49 0.02083333     -0.0004432624                 2
#10: 10   50 0.02040816     -0.0004251701                 3
#11: 11   51 0.02000000     -0.0004081633                 4
#12: 12   53 0.03921569      0.0192156863                 0
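To see what the grouping step does, the sketch below rebuilds run ids with base R's rle() as a stand-in for rleid. This is an assumption made for illustration, not how data.table implements it, and it only matches rleid for vectors without NA values, since rle() treats each NA as its own run. The names flags, runs, ids and streak are hypothetical.

```r
# The 0/1 flags from rows 3 to 12 of the question's output (no NA values).
flags <- c(0, 1, 0, 1, 0, 1, 1, 1, 1, 0)

# rle() finds runs of identical values; repeating the run index by the
# run length gives one id per element, similar to what rleid() returns.
runs <- rle(flags)
ids <- rep(seq_along(runs$lengths), runs$lengths)

# cumsum within each run: runs of 1s count up, runs of 0s stay at 0.
streak <- ave(flags, ids, FUN = cumsum)

print(ids)
print(streak)
```

Because every run contains only 0s or only 1s, the cumulative sum restarts at each change, which is what turns the repeated 1s into 1, 2, 3, 4.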
We can use it with dplyr too:
library(dplyr)
df %>%
   group_by(grp = rleid(pct_growth_streak)) %>%
   mutate(pct_growth_streak = cumsum(pct_growth_streak))
Or with ave:
with(df, ave(pct_growth_streak, rleid(pct_growth_streak), FUN = cumsum))
One approach: first define a grouping variable sgrp that increments with each sign change of pct_growth_change:
df <- df %>% mutate(sgrp = cumsum(if_else(sign(pct_growth_change) == 
                               sign(lag(pct_growth_change, 1)), 0, 1, 1)))
Then group by sgrp and set pct_growth_streak as the row number within the group if pct_growth_change is negative.
df %>% 
  group_by(sgrp) %>% 
  mutate(pct_growth_streak = 
    (pct_growth_change < 0) * row_number()
  ) %>%
  ungroup() %>%
  select(-sgrp)
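As a cross-check, the same column can be produced in base R with an explicit running counter. This also shows why the vectorised ifelse() in the question never gets past 1: lag(pct_growth_streak) is evaluated once on the column of zeros that existed before mutate() ran, so there is no carry-over from row to row. The helper name streak_count is hypothetical, and treating NA as "reset and return NA" is an assumption chosen to match the desired output.

```r
# Hypothetical helper: running count of consecutive TRUE values.
# FALSE resets the count to 0; NA resets it and yields NA.
streak_count <- function(flag) {
  out <- rep(NA_real_, length(flag))
  run <- 0
  for (i in seq_along(flag)) {
    if (is.na(flag[i])) {
      run <- 0
    } else if (flag[i]) {
      run <- run + 1
      out[i] <- run
    } else {
      run <- 0
      out[i] <- 0
    }
  }
  out
}

# Rebuild the question's columns without dplyr.
data <- c(19, 19, 27, 27, 38, 42, 47, 48, 49, 50, 51, 53)
pct_growth <- c(NA, diff(data) / head(data, -1))
pct_growth_change <- c(NA, diff(pct_growth))

# Same condition as the question's ifelse(): change <= 0.
streak <- streak_count(pct_growth_change <= 0)
print(streak)
```

The loop is slower than the grouped versions on large data, but it makes the row-to-row dependency explicit.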