
Calculating percentage change from a fixed year for different groups

Tags: r, dplyr

How can I calculate the percentage change of my variable var2 for different cities over time, relative to the year 2000?

I tried this:

library(dplyr)
data <- data.frame(cities= c('NY','NY','NY','NY','NY','PL','PL', 'PL','PL','PL','AS','AS','AS','AS','AS','RY','RY','RY','RY','RY', 'JK', 'JK', 'JK', 'JK', 'JK'), year=c('2000', '2002', '2004', '2006', '2008', '2000', '2002', '2004', '2006', '2008','2000', '2002', '2004', '2006', '2008','2000', '2002', '2004', '2006', '2008','2000', '2002', '2004', '2006', '2008'), 
                    var2 = c(12,26,17,8,14, 12,20,10,8,14,12,20,10,8,14,12,20,10,8,14,12,20,10,3,5))

changes <- data %>%
    group_by(cities) %>%
    arrange(year, .by_group = TRUE) %>%
    mutate(variable_change = round((var2/lag(var2) - 1)*100, digits = 1))

But this calculates the percentage change between consecutive years, whereas I'm trying to calculate the change between 2000 and 2002, 2000 and 2004, and so on.

wanderzen asked Nov 27 '25 11:11


2 Answers

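You can use match to look up the var2 value for the year 2000 within each city and divide every var2 value in that city by it. The result is the ratio to the 2000 baseline; subtract 1 and multiply by 100 to express it as a percentage change.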

library(dplyr)

data %>%
  group_by(cities) %>%
  mutate(variable_change = var2/var2[match(2000, year)])

#  cities year   var2 variable_change
#   <chr>  <chr> <dbl>           <dbl>
# 1 NY     2000     12           1    
# 2 NY     2002     26           2.17 
# 3 NY     2004     17           1.42 
# 4 NY     2006      8           0.667
# 5 NY     2008     14           1.17 
# 6 PL     2000     12           1    
# 7 PL     2002     20           1.67 
# 8 PL     2004     10           0.833
# 9 PL     2006      8           0.667
#10 PL     2008     14           1.17 
# … with 15 more rows

We can also use == if each city is guaranteed to have exactly one row with year 2000.

data %>%
  group_by(cities) %>%
  mutate(variable_change = var2/var2[year == 2000])
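
Since the question asks for a percentage change rather than a ratio, here is a minimal sketch of one way to convert the ratio above. It assumes the sample data from the question, with year stored as character; the column names base_2000 and pct_change and the result name pct are made up for illustration.

```r
library(dplyr)

# Same sample data as in the question
data <- data.frame(
  cities = rep(c("NY", "PL", "AS", "RY", "JK"), each = 5),
  year   = rep(c("2000", "2002", "2004", "2006", "2008"), times = 5),
  var2   = c(12, 26, 17, 8, 14,  12, 20, 10, 8, 14,  12, 20, 10, 8, 14,
             12, 20, 10, 8, 14,  12, 20, 10, 3, 5)
)

pct <- data %>%
  group_by(cities) %>%
  mutate(
    # var2 in the baseline year for this city (hypothetical helper column)
    base_2000  = var2[match("2000", year)],
    # percentage change relative to the baseline, rounded to one decimal
    pct_change = round((var2 / base_2000 - 1) * 100, digits = 1)
  ) %>%
  ungroup()
```

With this, the 2000 row of every city is 0, and NY in 2002 comes out as 116.7 because 26 / 12 - 1 is about 1.167.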
Ronak Shah answered Nov 30 '25 01:11


We can use %in%, which also works when year contains NAs, because %in% never returns NA.

library(dplyr)
data %>%
      group_by(cities) %>%
      mutate(variable_change = var2/var2[year %in% 2000])
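
To see why the NA handling matters, here is a small base R sketch on a made-up three-row vector (the values and the names eq_mask, in_mask, eq_base and in_base are illustrative only, not taken from the question's data).

```r
# Toy vectors with one missing year (illustrative values)
year <- c("2000", NA, "2004")
var2 <- c(12, 20, 10)

eq_mask <- year == 2000    # TRUE NA FALSE: comparing NA gives NA
in_mask <- year %in% 2000  # TRUE FALSE FALSE: %in% never returns NA

eq_base <- var2[eq_mask]   # 12 NA: the NA index adds an extra NA element
in_base <- var2[in_mask]   # 12: a single baseline value
```

With ==, the baseline is no longer a single value once a year is missing, so the division by it stops behaving as intended; with %in%, the missing year is simply treated as not matching.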
akrun answered Nov 29 '25 23:11


