Mutate a column of models: "Error: Problem with `mutate()` input `model`. x Input `model` must be a vector, not a `lm` object."

Question

I have a data frame with a column that contains a model formula definition. I would like to mutate a new column in which each row holds a model fitted from that row's formula.

Some data:

# Set up
library(tidyverse)
library(lubridate)


# Create data
mydf <- data.frame(
  cohort = seq(ymd('2019-01-01'), ymd('2019-12-31'), by = '1 days'),
  n = rnorm(365, 1000, 50) %>% round,
  cohort_cost = rnorm(365, 800, 50)
) %>% 
  crossing(tenure_days = 0:365) %>% 
  mutate(activity_date = cohort + days(tenure_days)) %>% 
  mutate(daily_revenue = rnorm(nrow(.), 20, 1)) %>% 
  group_by(cohort) %>% 
  arrange(activity_date) %>% 
  mutate(cumulative_revenue = cumsum(daily_revenue)) %>% 
  arrange(cohort, activity_date) %>% 
  mutate(payback_velocity = round(cumulative_revenue / cohort_cost, 2)) %>% 
  select(cohort, n, cohort_cost, activity_date, tenure_days, everything())

## wider data
mydf_wide <- mydf %>% 
  select(cohort, n, cohort_cost, tenure_days, payback_velocity) %>% 
  group_by(cohort, n, cohort_cost) %>% 
  pivot_wider(names_from = tenure_days, values_from = payback_velocity, names_prefix = 'velocity_day_')

Now, the final problem code block. It fails on the very last line:

models <- data.frame(
  from = mydf$tenure_days %>% unique,
  to = mydf$tenure_days %>% unique
) %>% 
  expand.grid %>% 
  filter(to > from) %>% 
  filter(from > 0) %>% 
  arrange(from) %>% 
  mutate(mod_formula = paste0('velocity_day_', to, ' ~ velocity_day_', from)) %>% 
  mutate(model = lm(as.formula(mod_formula), data = mydf_wide))

Error: Problem with `mutate()` input `model`.
x Input `model` must be a vector, not a `lm` object.
ℹ Input `model` is `lm(as.formula(mod_formula), data = mydf_wide)`.

If I run the last code block minus the last line and take a look at the resulting data frame 'models' it looks like this:

models %>% head
  from to                     mod_formula
1    1  2 velocity_day_2 ~ velocity_day_1
2    1  3 velocity_day_3 ~ velocity_day_1
3    1  4 velocity_day_4 ~ velocity_day_1
4    1  5 velocity_day_5 ~ velocity_day_1
5    1  6 velocity_day_6 ~ velocity_day_1
6    1  7 velocity_day_7 ~ velocity_day_1

I tried making it a list column, but as far as I'm aware that requires grouping first, and in this case I would need to group by every column. I amended the last code block:

models <- data.frame(
  from = mydf$tenure_days %>% unique,
  to = mydf$tenure_days %>% unique
) %>% 
  expand.grid %>% 
  filter(to > from) %>% 
  filter(from > 0) %>% 
  arrange(from) %>% 
  mutate(mod_formula = paste0('velocity_day_', to, ' ~ velocity_day_', from)) %>% 
  group_by_all() %>% 
  nest() %>% 
  mutate(model = lm(as.formula(mod_formula), data = mydf_wide))

However this results in the same error.

How can I add a new column onto 'models' that contains a linear model for each row based on the formula in field 'mod_formula'?

Ronak Shah · Accepted Answer

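lm is not vectorized: each call fits one model and returns a single lm object, which mutate cannot store as an ordinary column. Add rowwise() to fit a model for each row, and wrap the result in list() so it is stored in a list column.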

library(dplyr)

models <- data.frame(
  from = mydf$tenure_days %>% unique,
  to = mydf$tenure_days %>% unique
) %>% 
  expand.grid %>% 
  filter(to > from) %>% 
  filter(from > 0) %>% 
  arrange(from) %>% 
  mutate(mod_formula = paste0('velocity_day_', to, ' ~ velocity_day_', from)) %>%
  rowwise() %>%
  mutate(model = list(lm(as.formula(mod_formula), data = mydf_wide)))

models

#  from    to mod_formula                     model 
#  <int> <int> <chr>                           <list>
#1     1     2 velocity_day_2 ~ velocity_day_1 <lm>  
#2     1     3 velocity_day_3 ~ velocity_day_1 <lm>  
#3     1     4 velocity_day_4 ~ velocity_day_1 <lm>  
#4     1     5 velocity_day_5 ~ velocity_day_1 <lm>  
#5     1     6 velocity_day_6 ~ velocity_day_1 <lm>  
#6     1     7 velocity_day_7 ~ velocity_day_1 <lm>  
#...
#...
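To see the same pattern in isolation, here is a minimal sketch on the built-in mtcars data. The names toy, toy_models and slope are made up for illustration and are not part of the question's data. It also shows that, while the data frame is still rowwise, a later mutate sees `model` as the single lm object for that row, so values such as a coefficient can be pulled out directly.

```r
library(dplyr)

# Hypothetical toy input: two formulas to fit against the built-in mtcars data
toy <- data.frame(
  mod_formula = c("mpg ~ wt", "mpg ~ hp"),
  stringsAsFactors = FALSE
)

toy_models <- toy %>%
  rowwise() %>%
  # list() wraps each lm object so it can be stored in a list column
  mutate(model = list(lm(as.formula(mod_formula), data = mtcars))) %>%
  # under rowwise(), `model` is the lm object of the current row
  mutate(slope = coef(model)[[2]]) %>%
  # drop the rowwise grouping so later verbs behave column-wise again
  ungroup()

toy_models
```

The ungroup() at the end is optional, but without it every subsequent verb keeps operating one row at a time.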

You can also use purrr::map instead of rowwise. Replace the last two steps of the pipeline (rowwise and mutate) with:

mutate(model = purrr::map(mod_formula, ~lm(.x, data = mydf_wide)))
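As a rough, self-contained sketch of the map approach, again on mtcars rather than the question's data (toy, toy_models and r_squared are illustrative names only): map returns a list, so no list() wrapper or rowwise() is needed, and a typed variant such as map_dbl can then pull one number out of each fitted model.

```r
library(dplyr)
library(purrr)

# Hypothetical toy input: one formula string per row
toy <- tibble(mod_formula = c("mpg ~ wt", "mpg ~ hp"))

toy_models <- toy %>%
  mutate(
    # map() returns a list with one lm object per formula
    model = map(mod_formula, ~ lm(as.formula(.x), data = mtcars)),
    # map_dbl() extracts a single number from each model
    r_squared = map_dbl(model, ~ summary(.x)$r.squared)
  )

toy_models
```

Because the data frame is never made rowwise, there is nothing to ungroup afterwards.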

Tags:

r

Doug Fir