
Retain nesting variable when using select on nested tibble

Tags:

r

dplyr

purrr

tidyr

I am using the code from this question (below) to save the columns of a nested tibble into a new list of tibbles (each column becoming one tibble in the list). However, when I use select on the nested tibble, the nesting variable is lost. I'd like to retain it, so that the grouping variable stays with the results.

For example, 'results %>% unnest(tidied)' keeps "carb", but 'results %>% select(tidied) %>% map(~bind_rows(.))' does not.

How can I keep the nested variable with the selected columns?

library(tidyverse)
library(broom)
data(mtcars)
df <- mtcars

nest.df <- df %>% nest(data = -carb)

results <- nest.df %>% 
  mutate(fit = map(data, ~ lm(mpg ~ wt, data=.x)),
         tidied = map(fit, tidy),
         glanced = map(fit, glance),
         augmented = map(fit, augment))

final <- results %>% select(glanced, tidied, augmented ) %>% 
        map(~bind_rows(.))
nofunsally asked Oct 25 '25 06:10


1 Answer

We can do a mutate_at before the select step (the expected output is not entirely clear, though). Here mutate_at loops over each of the selected columns. These are list-columns whose elements are themselves tibbles, so inside the function (list(~ ...)) we use map2 to pair each nested tibble with the corresponding 'carb' value, and then add that value to the nested tibble as a new 'carb' column with mutate.

results %>%
  mutate_at(vars(glanced, tidied, augmented), 
          list(~ map2(.,carb, ~ .x %>% mutate(carb = .y)))) %>% 
  select(glanced, tidied, augmented) %>% 
  map(~ bind_rows(.x))
#$glanced
# A tibble: 6 x 12
#  r.squared adj.r.squared  sigma statistic   p.value    df logLik    AIC    BIC deviance df.residual  carb
#      <dbl>         <dbl>  <dbl>     <dbl>     <dbl> <int>  <dbl>  <dbl>  <dbl>    <dbl>       <int> <dbl>
#1   0.696           0.658   2.29  18.3      0.00270      2 -21.4    48.7   49.6    41.9            8     4
#2   0.654           0.585   3.87   9.44     0.0277       2 -18.2    42.4   42.3    74.8            5     1
#3   0.802           0.777   2.59  32.3      0.000462     2 -22.6    51.1   52.1    53.5            8     2
#4   0.00295        -0.994   1.49   0.00296  0.965        2  -3.80   13.6   10.9     2.21           1     3
#5   0               0     NaN     NA       NA            1 Inf    -Inf   -Inf       0              0     6
#6   0               0     NaN     NA       NA            1 Inf    -Inf   -Inf       0              0     8

#$tidied
# A tibble: 10 x 6
#   term        estimate std.error statistic      p.value  carb
#   <chr>          <dbl>     <dbl>     <dbl>        <dbl> <dbl>
# 1 (Intercept)   27.9       2.91     9.56     0.0000118      4
# 2 wt            -3.10      0.724   -4.28     0.00270        4
#...
#...
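
In dplyr 1.0.0 and later the scoped verbs such as mutate_at are superseded by across(), so the same idea can be sketched as follows. This is only an illustration under some assumptions that are not part of the original post: the names final_across and final_unnest are made up here, and the data is restricted to 'carb' groups with at least three cars so that every per-group lm has at least one residual degree of freedom. The second variant builds on the observation in the question that unnest keeps 'carb': it selects 'carb' together with one list-column at a time and unnests that column.

```r
library(dplyr)
library(tidyr)
library(purrr)
library(broom)

# Assumption for this sketch: keep only carb groups with >= 3 rows,
# so each per-group regression has residual degrees of freedom.
results <- mtcars %>%
  group_by(carb) %>%
  filter(n() >= 3) %>%
  ungroup() %>%
  nest(data = -carb) %>%
  mutate(fit = map(data, ~ lm(mpg ~ wt, data = .x)),
         tidied = map(fit, tidy),
         glanced = map(fit, glance),
         augmented = map(fit, augment))

# Variant 1: across() instead of mutate_at().
# Each nested tibble gets its group's carb value as a new column
# before the list-columns are row-bound.
final_across <- results %>%
  mutate(across(c(glanced, tidied, augmented),
                ~ map2(.x, carb, function(tbl, k) mutate(tbl, carb = k)))) %>%
  select(glanced, tidied, augmented) %>%
  map(bind_rows)

# Variant 2: keep carb in the selection and unnest one list-column at a time.
final_unnest <- c("glanced", "tidied", "augmented") %>%
  set_names() %>%
  map(~ results %>% select(carb, all_of(.x)) %>% unnest(all_of(.x)))
```

Both variants return a named list with one tibble per list-column. They differ in column order: the first appends 'carb' as the last column, while the second keeps it as the first.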
akrun answered Oct 26 '25 19:10