I would like to apply a linear extrapolation for one year within a pipe. What I would like to do is very similar to this simple example without grouping. But within a pipe and employing dplyr::group_by()
. There are some examples like this one, this one or this one. But I can't manage to get the desirable output.
Reproducible example:
test.frame <- data.frame(Country =
rep(c("Austria", "Brazil", "Canada"), each = 3, times = 3),
Entity = rep(c("CO2","CH4","N2O"), times = 9),
Year = rep(c(1990:1992), each = 9),
value = runif(27, 1,5))
test.frame2 <- data.frame(Country =
rep(c("Austria", "Brazil", "Canada"), each = 3),
Entity = rep(c("CO2","CH4","N2O"), times = 3),
Year = rep(c(1993), each = 3),
value = 0)
results_frame <- test.frame %>%
dplyr::bind_rows(test.frame2)
I have two grouping categories (Country and Entity), and I would like to use the values for years 1990 to 1992 to fill those for year 1993 using linear extrapolation. On the basis of this, I can estimate the linear model:
linear_model <- test.frame %>%
dplyr::group_by(Country, Entity) %>%
lm(value ~ Year, data=.)
results <- predict.lm(linear_model, test.frame2)
But then, results
doesn't show the desirable output. So following the solution proposed here I try the following:
results_frame <- test.frame %>%
dplyr::group_by(Country, Entity) %>%
do(lm( value ~ Year , data = test.frame)) %>%
predict.lm(linear_model, test.frame2) %>%
bind_rows(test.frame)
But it doesn't work, instead I get
Error: Results 1, 2, 3, 4, 5, ... must be data frames, not lm
Any help would be highly appreciated!
You have to be careful to use the right data when fitting and predicting:
library(dplyr)
set.seed(42)
test.frame <- tibble(Country = rep(c("Austria", "Brazil", "Canada"), each = 3, times = 3),
Entity = rep(c("CO2","CH4","N2O"), times = 9),
Year = rep(c(1990:1992), each = 9),
value = runif(27, 1,5))
test.frame %>%
group_by(Country, Entity) %>%
do(lm( value ~ Year , data = .) %>%
predict(., tibble(Year = 1993)) %>%
tibble(Year = 1993, value = .)) %>%
bind_rows(test.frame)
#> # A tibble: 36 x 4
#> # Groups: Country, Entity [9]
#> Country Entity Year value
#> <fct> <fct> <dbl> <dbl>
#> 1 Austria CH4 1993 2.10
#> 2 Austria CO2 1993 2.03
#> 3 Austria N2O 1993 6.02
#> 4 Brazil CH4 1993 4.90
#> 5 Brazil CO2 1993 0.771
#> 6 Brazil N2O 1993 5.28
#> 7 Canada CH4 1993 4.69
#> 8 Canada CO2 1993 0.729
#> 9 Canada N2O 1993 1.49
#> 10 Austria CO2 1990 4.66
#> # ... with 26 more rows
You can do something like the following using nested data.frames. This solution is more general in that one do not need to re-create test.frame2
after the prediction and there could be more than one independent variables:
library(tidyverse)
test.frame %>%
group_by(Country, Entity) %>%
nest() %>%
inner_join(test.frame2 %>% select(-value) %>% group_by(Country, Entity) %>% nest(),
by = c("Country", "Entity")) %>%
mutate(model = data.x %>% map(~lm(value ~ Year, data=.)),
value = map2(model, data.y, predict)) %>%
select(-data.x, -model) %>%
unnest() %>%
bind_rows(test.frame, .)
Result:
Country Entity Year value
1 Austria CO2 1990 3.6245955
2 Austria CH4 1990 3.3857752
3 Austria N2O 1990 1.4798741
4 Brazil CO2 1990 2.5865668
5 Brazil CH4 1990 1.3271481
6 Brazil N2O 1990 4.4537926
7 Canada CO2 1990 4.7295768
8 Canada CH4 1990 4.5255033
9 Canada N2O 1990 2.3129381
10 Austria CO2 1991 4.8810838
11 Austria CH4 1991 4.9950455
12 Austria N2O 1991 2.1288504
13 Brazil CO2 1991 4.7767443
14 Brazil CH4 1991 2.0315449
15 Brazil N2O 1991 1.9307966
16 Canada CO2 1991 4.6831029
17 Canada CH4 1991 2.2761538
18 Canada N2O 1991 3.0856428
19 Austria CO2 1992 3.1223000
20 Austria CH4 1992 4.7715588
21 Austria N2O 1992 1.5733608
22 Brazil CO2 1992 2.9463442
23 Brazil CH4 1992 1.9569259
24 Brazil N2O 1992 1.4428006
25 Canada CO2 1992 3.0750847
26 Canada CH4 1992 1.4635521
27 Canada N2O 1992 2.8061861
28 Austria CO2 1993 3.3736976
29 Austria CH4 1993 5.7699101
30 Austria N2O 1993 1.8208485
31 Brazil CO2 1993 3.7963291
32 Brazil CH4 1993 2.4016508
33 Brazil N2O 1993 -0.4018621
34 Canada CO2 1993 2.5080960
35 Canada CH4 1993 -0.3068815
36 Canada N2O 1993 3.2281704
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With