I am looking for a way to, within id and groups, create an index on 100 using the lag (or is it lead) of  value and the new index number idx_value to calculate the next index number.
# install.packages(c("tidyverse"), dependencies = TRUE)
library(tibble)
library(magrittr)
Like, I have this dataframe:
start_tbl <- structure(list(id = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 
2L, 2L, 2L, 2L), grp = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 
1L, 1L, 2L, 2L, 2L), year = c(7L, 8L, 9L, 10L, 7L, 8L, 9L, 10L, 
7L, 8L, 9L, 7L, 8L, 9L), value = c(2, -7, -2.3, 1.1, -1, -12, 
-4, 2, 1, -3, 2, -1, -4, -2)), row.names = c(NA, -14L), class = c("tbl_df", 
"tbl", "data.frame"))
start_tbl
# A tibble: 14 x 4
      id   grp  year value
   <int> <int> <int> <dbl>
 1     1     1     7   2  
 2     1     1     8  -7  
 3     1     1     9  -2.3
 4     1     1    10   1.1
 5     1     2     7  -1  
 6     1     2     8 -12  
 7     1     2     9  -4  
 8     1     2    10   2  
 9     2     1     7   1  
10     2     1     8  -3  
11     2     1     9   2  
12     2     2     7  -1  
13     2     2     8  -4  
14     2     2     9  -2  
now I want to take id 1 grp 1 and make the index, then calculate id 1 grp 1 year 7 as 100*(1+-7/100) = 93.0, next use that result, 93, to calculate the next year: 93*(1+-2.3/100)= 90.861, and so forth. Restarting on all index years, which is a new id and a new grp and base year 7.
I am quite close with:
tbl %>% group_by(id) %>% mutate(idx_value = value-lag(value), idx_value = 100*(1+value/100) )
# A tibble: 14 x 5
# Groups:   id [2]
      id   grp  year value idx_value
   <int> <int> <int> <dbl>     <dbl>
 1     1     1     7   2       102  
 2     1     1     8  -7        93  
 3     1     1     9  -2.3      97.7
 4     1     1    10   1.1     101. 
 5     1     2     7  -1        99  
 6     1     2     8 -12        88  
 7     1     2     9  -4        96  
 8     1     2    10   2       102  
 9     2     1     7   1       101  
10     2     1     8  -3        97  
11     2     1     9   2       102  
12     2     2     7  -1        99  
13     2     2     8  -4        96  
14     2     2     9  -2        98  
but what I am trying to get to is:
end_tbl <- structure(list(id = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 
2L, 2L, 2L, 2L), grp = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 
1L, 1L, 2L, 2L, 2L), year = c(7L, 8L, 9L, 10L, 7L, 8L, 9L, 10L, 
7L, 8L, 9L, 7L, 8L, 9L), value = c(2, -7, -2.3, 1.1, -1, -12, 
-4, 2, 1, -3, 2, -1, -4, -2), idx_value = c(100L, 93L, 91L, 92L, 
100L, 88L, 84L, 86L, 100L, 97L, 99L, 100L, 96L, 94L)), row.names = c(NA, 
-14L), class = c("tbl_df", "tbl", "data.frame"))
end_tbl
# A tibble: 14 x 5
      id   grp  year value idx_value
   <int> <int> <int> <dbl>     <int>
 1     1     1     7   2         100
 2     1     1     8  -7          93
 3     1     1     9  -2.3        91
 4     1     1    10   1.1        92
 5     1     2     7  -1         100
 6     1     2     8 -12          88
 7     1     2     9  -4          84
 8     1     2    10   2          86
 9     2     1     7   1         100
10     2     1     8  -3          97
11     2     1     9   2          99
12     2     2     7  -1         100
13     2     2     8  -4          96
14     2     2     9  -2          94
Any help with be appreciated. Maybe the answer is here.
start_tbl2 to illustrate the issue. If I use a starting tibble like start_tbl2 below    start_tbl2 <- structure(list(id = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), 
grp = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L),
year = c(7L, 8L, 9L, 10L, 7L, 8L, 9L, 10L), 
value = c(2, -12, -18.3, 100, 15, 30, 40, -50)), 
row.names = c(NA, -8L), class = c("tbl_df", "tbl", "data.frame"))
library(dplyr)
start_tbl2 %>%
   group_by(id, grp) %>% 
   mutate(idx_value = c(100, round(100 * (1 + cumsum(value[-1])/100))))
# A tibble: 8 x 5
# Groups:   id, grp [2]
     id   grp  year value idx_value
  <int> <int> <int> <dbl>     <dbl>
1     1     1     7   2         100
2     1     1     8 -12          88
3     1     1     9 -18.3        70
4     1     1    10 100         170
5     1     2     7  15         100
6     1     2     8  30         130
7     1     2     9  40         170
8     1     2    10 -50         120
Whereas I get this when I calculate it by hand:
Percentage_change   cal_by_hand cumsum  diff
2                   100         100     0
-12                 88          88      0
-18.3               71.896      70      1.896
100                 143.792     170     -26.208
15                  100         100     0
30                  130         130     0
40                  182         170     12
-50                 91          120     -29
To calculate the Price Index, take the price of the Market Basket of the year of interest and divide by the price of the Market Basket of the base year, then multiply by 100.
How do I calculate percentage change? If you are tracking a particular stock's price increase, use the formula (New Price - Old Price)/Old Price and then multiply that number by 100. If the price decreased, use the formula (Old Price - New Price)/Old Price and multiply that number by 100.
The index is calculated by adding the stock prices of the 30 companies and then dividing by the divisor. The divisor changes when there are stock splits or dividends or when a company is added or removed from the index.
Another way would be to use cumprod() after converting the values to percentages:
library(dplyr)
start_tbl %>%
  group_by(id, grp) %>%
  mutate(idx_value = cumprod(c(100, (100 + value[-1]) / 100))) 
# A tibble: 14 x 5
# Groups:   id, grp [4]
      id   grp  year value idx_value
   <int> <int> <int> <dbl>     <dbl>
 1     1     1     7   2       100  
 2     1     1     8  -7        93  
 3     1     1     9  -2.3      90.9
 4     1     1    10   1.1      91.9
 5     1     2     7  -1       100  
 6     1     2     8 -12        88  
 7     1     2     9  -4        84.5
 8     1     2    10   2        86.2
 9     2     1     7   1       100  
10     2     1     8  -3        97  
11     2     1     9   2        98.9
12     2     2     7  -1       100  
13     2     2     8  -4        96  
14     2     2     9  -2        94.1
Based on the new dataset
library(purrr)
library(dplyr)
start_tbl2 %>%
      group_by(id, grp) %>%
      mutate(idx_vlue = accumulate(value[-1], ~ .x * (1 + .y/100), .init = 100 ))
# A tibble: 8 x 5
# Groups:   id, grp [2]
#     id   grp  year value idx_vlue
#  <int> <int> <int> <dbl>    <dbl>
#1     1     1     7   2      100  
#2     1     1     8 -12       88  
#3     1     1     9 -18.3     71.9
#4     1     1    10 100      144. 
#5     1     2     7  15      100  
#6     1     2     8  30      130  
#7     1     2     9  40      182  
#8     1     2    10 -50       91  
and using 'start_tbl
start_tbl %>%
     group_by(id, grp) %>%
     mutate(idx_vlue = accumulate(value[-1], ~ .x * (1 + .y/100), .init = 100 ))
# A tibble: 14 x 5
# Groups:   id, grp [4]
#      id   grp  year value idx_vlue
#   <int> <int> <int> <dbl>    <dbl>
# 1     1     1     7   2      100  
# 2     1     1     8  -7       93  
# 3     1     1     9  -2.3     90.9
# 4     1     1    10   1.1     91.9
# 5     1     2     7  -1      100  
# 6     1     2     8 -12       88  
# 7     1     2     9  -4       84.5
# 8     1     2    10   2       86.2
# 9     2     1     7   1      100  
#10     2     1     8  -3       97  
#11     2     1     9   2       98.9
#12     2     2     7  -1      100  
#13     2     2     8  -4       96  
#14     2     2     9  -2       94.1
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With