I've calculated a different regression for each group in a data frame:
DF.L <- DF %>%
group_by(Channel) %>%
do(Fit = rlm(L ~ -1 + Y + I(Y^2), data = .))
I want to apply this set of regressions to another data frame. To do so, I'm testing how to apply it to the same data frame:
DF %>%
group_by(Channel) %>%
do({
Lfit <- predict(subset(DF.L, Channel == unique(.$Channel))$Fit, .)
data.frame(., Lfit)
})
glimpse(DF)
But I keep getting this error:
Error in UseMethod("predict") :
no applicable method for 'predict' applied to an object of class "list"
Calls: %>% ... do_.grouped_df -> eval -> eval -> predict -> predict
What I am doing wrong?
Using the built-in ChickWeight data:
library(dplyr)
library(MASS)
library(broom)
library(tidyr)
library(ggplot2)
head(ChickWeight)
weight Time Chick Diet 1 42 0 1 1 2 51 2 1 1 3 59 4 1 1 4 64 6 1 1 5 76 8 1 1 6 93 10 1 1
ChickWeight_models <- ChickWeight %>%
group_by(Diet) %>%
do(fit = MASS::rlm(weight ~ Time + I(Time^2), data = .))
ChickWeight_models
Source: local data frame [4 x 2] Groups: <by row> # A tibble: 4 x 2 Diet fit * <fctr> <list> 1 1 <S3: rlm> 2 2 <S3: rlm> 3 3 <S3: rlm> 4 4 <S3: rlm>
So I've created a very similar object to your DF.L. It's a frame with the four groups, each with an rlm object in a list-column called fit.
Now I'll make up some data to test this model on. In this case, I'll just take the original data and add some noise to each of the variables.
ChickWeight_simulated <- ChickWeight %>%
mutate(Time = Time + runif(length(Time)),
weight = weight + rnorm(length(weight)))
ChickWeight_simulated
weight Time Chick Diet 1 42.72075 0.9786272 1 1 2 51.12669 2.8399631 1 1 3 58.64632 4.4576380 1 1 4 63.77617 6.1083591 1 1 5 75.40434 8.1051792 1 1 6 91.75830 10.7899030 1 1
Now we want to combine the dataframe of the models with the new data to test on. First we group_by and tidyr::nest the simulated data. This creates an object that is a dataframe with the four groups and a list-column called data, each element of which contains a rolled-up dataframe.
ChickWeight_simulated %>% group_by(Diet) %>% nest()
# A tibble: 4 x 2 Diet data <fctr> <list> 1 1 <tibble [220 x 3]> 2 2 <tibble [120 x 3]> 3 3 <tibble [120 x 3]> 4 4 <tibble [118 x 3]>
Then we can join it to the models dataframe:
ChickWeight_simulated %>% group_by(Diet) %>% nest() %>%
full_join(ChickWeight_models)
# A tibble: 4 x 3 Diet data fit <fctr> <list> <list> 1 1 <tibble [220 x 3]> <S3: rlm> 2 2 <tibble [120 x 3]> <S3: rlm> 3 3 <tibble [120 x 3]> <S3: rlm> 4 4 <tibble [118 x 3]> <S3: rlm>
Now we group by Diet again, and use broom::augment to make a prediction of each model on the new simulated data. Since each group is one row, there is one element each of fit and data; we have to extract that single element out of each list-column into a usable form by using [[1]].
ChickWeight_simulated_predicted <-
ChickWeight_simulated %>% group_by(Diet) %>% nest() %>%
full_join(ChickWeight_models) %>%
group_by(Diet) %>%
do(augment(.$fit[[1]], newdata = .$data[[1]]))
head(ChickWeight_simulated_predicted)
# A tibble: 6 x 6 # Groups: Diet [1] Diet weight Time Chick .fitted .se.fit <fctr> <dbl> <dbl> <ord> <dbl> <dbl> 1 1 42.72075 0.9786272 1 43.62963 2.368838 2 1 51.12669 2.8399631 1 51.80855 1.758385 3 1 58.64632 4.4576380 1 59.67606 1.534051 4 1 63.77617 6.1083591 1 68.43218 1.534152 5 1 75.40434 8.1051792 1 80.00678 1.647612 6 1 91.75830 10.7899030 1 97.26450 1.726331
To prove that this really only used the model from a particular level of Diet on the simulated data from that level of Diet, we can visualize the model fit.
ChickWeight_simulated_predicted %>%
ggplot(aes(Time, weight)) +
geom_point(shape = 1) +
geom_ribbon(aes(Time,
ymin = .fitted-1.96*.se.fit,
ymax = .fitted+1.96*.se.fit),
alpha = 0.5, fill = "black") +
geom_line(aes(Time, .fitted), size = 1, color = "red") +
facet_wrap(~Diet)

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With