
take mean of variable defined by string in dplyr

Tags:

r

dplyr

tidyverse


Seems like this should be easy, but I'm stumped. I've gotten the rough hang of programming with dplyr 0.7, but I'm struggling with this: how do I program in dplyr if the variable I want to program with is a string?

I am scraping a database and, for a variety of reasons, want to summarize a variable whose position I will know but whose name I will not (the thing I want is always the first column of the supplied table, but the name of the variable stored in that column will vary depending on the database being scraped). To use iris as an example, suppose I know that the variable I want is in the first column:

library(tidyverse)
desired_var <- colnames(iris)[1]
print(desired_var)
[1] "Sepal.Length"

I now want to group by Species and take the mean of desired_var, i.e. what I want is to perform:

iris %>% 
group_by(Species) %>% 
summarise(desired_mean = mean(Sepal.Length))

But now I want to take the mean of a column that is defined by a string stored in desired_var.

I get how to do this with a "bare" Sepal.Length:

desired_var <- quo(Sepal.Length)

iris %>% 
group_by(Species) %>% 
summarise(desired_mean = mean(!!desired_var))

But how in the world do I deal with the fact that I have "Sepal.Length", not Sepal.Length, i.e. that desired_var <- "Sepal.Length"?

DanO asked Dec 09 '25 13:12


2 Answers

You're wandering into tidyeval, which is a rather new feature of the tidyverse (see here) and is mostly used to write functions that wrap tidyverse functions. For now it is only available with dplyr, but the plan is to extend it to the other tidyverse packages.

For your purposes, though, you don't really need to get into that, because summarise_at will do. This function lets you apply a manipulation that you specify across any variables of your choosing:

iris %>% 
  group_by(Species) %>% 
  summarise_at(vars(one_of("Sepal.Length", "Sepal.Width")), funs(desired_mean = mean))

# A tibble: 3 x 3
     Species Sepal.Length_desired_mean Sepal.Width_desired_mean
      <fctr>                     <dbl>                    <dbl>
1     setosa                     5.006                    3.428
2 versicolor                     5.936                    2.770
3  virginica                     6.588                    2.974

You can store the variable names in a character vector, and then use that vector instead:

selected_vectors <- c("Sepal.Length", "Sepal.Width")
iris %>% 
  group_by(Species) %>% 
  summarise_at(vars(one_of(selected_vectors)), funs(desired_mean = mean))
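
For readers on a newer dplyr, here is a hedged sketch of the same idea using across() and all_of(). It assumes dplyr 1.0.0 or later, which the answer above does not rely on; the .names pattern is chosen only to mimic the column names shown above.

```r
library(dplyr)

# Column names held as strings, as in the answer above
selected_vectors <- c("Sepal.Length", "Sepal.Width")

result <- iris %>%
  group_by(Species) %>%
  summarise(
    # all_of() selects columns from a character vector;
    # .names controls how the output columns are labelled
    across(all_of(selected_vectors), mean, .names = "{.col}_desired_mean"),
    .groups = "drop"
  )

print(result)
```

all_of() raises an error if any of the named columns is missing, which may be helpful when the column names come from a scraped source.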

Phil answered Dec 11 '25 16:12


1) dynamic variable with !!sym

Use sym (or parse_expr) like this:

library(dplyr)
library(rlang)

desired_var <- "Sepal.Length"

iris %>% 
  group_by(Species) %>% 
  summarise(desired_mean = mean(!!sym(desired_var))) %>%
  ungroup

giving:

# A tibble: 3 x 2
     Species desired_mean
      <fctr>        <dbl>
1     setosa        5.006
2 versicolor        5.936
3  virginica        6.588
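
As an alternative to !!sym that is not part of the answer above, the .data pronoun can look a column up by a string. This is a minimal sketch, assuming dplyr 1.0.0 or later (for the .groups argument); it takes the name from the first column of iris, as in the question.

```r
library(dplyr)

# The name of the first column, known only as a string
desired_var <- colnames(iris)[1]

result <- iris %>%
  group_by(Species) %>%
  # .data[[...]] looks the column up by name within each group
  summarise(desired_mean = mean(.data[[desired_var]]), .groups = "drop")

print(result)
```

Because .data[[desired_var]] only ever refers to a column of the data frame, it cannot accidentally pick up a variable of the same name from the calling environment.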

2) summarise_at

As @Phil points out in the comments, in the particular case of summarise this can be done without using any rlang facilities:

library(dplyr)

desired_var <- "Sepal.Length"

iris %>% 
   group_by(Species) %>% 
   summarise_at(desired_var, funs(mean)) %>%
   ungroup

giving:

# A tibble: 3 x 2
     Species Sepal.Length
      <fctr>        <dbl>
1     setosa        5.006
2 versicolor        5.936
3  virginica        6.588

3) dynamic variable and name with !!

If you need to set the name dynamically in (1), then try this:

library(dplyr)
library(rlang)

desired_var <- "Sepal.Length"

desired_var_name <- paste("mean", desired_var, sep = "_")

iris %>% 
  group_by(Species) %>% 
  summarise(!!desired_var_name := mean(!!sym(desired_var))) %>%
  ungroup

giving:

# A tibble: 3 x 2
     Species mean_Sepal.Length
      <fctr>             <dbl>
1     setosa             5.006
2 versicolor             5.936
3  virginica             6.588
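
In more recent releases the output name can also be built with glue-style syntax on the left of :=, which avoids the separate paste() step. This is a sketch under the assumption that dplyr 1.0.0 or later (with a correspondingly recent rlang) is installed; it is not taken from the answer above.

```r
library(dplyr)

desired_var <- "Sepal.Length"

result <- iris %>%
  group_by(Species) %>%
  # "{desired_var}" in the name is interpolated from the string variable
  summarise("mean_{desired_var}" := mean(.data[[desired_var]]), .groups = "drop")

print(result)
```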

G. Grothendieck answered Dec 11 '25 15:12


