Programming with dplyr: Renaming a column with variable using glue syntax

I've read through Programming with dplyr and understand that rename() and select() use tidy selection. I'm trying to combine this with the glue syntax to create a custom function using the new double curly syntax (rlang v0.4.0). However, I'm getting extra quotation marks in the resulting column name:

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

sel_var = "homeworld"

# Attempt at using (newer) double curly syntax:
starwars %>% 
  select("{{sel_var}}_old" := {{ sel_var }})
#> # A tibble: 87 x 1
#>    `"homeworld"_old`
#>    <chr>            
#>  1 Tatooine               
#> # ... with 77 more rows

# Working, but uglier (and older) bang bang syntax:
starwars %>% 
  select(!!sym(paste0(sel_var, "_old")) := {{ sel_var }})
#> # A tibble: 87 x 1
#>    homeworld_old
#>    <chr>        
#>  1 Tatooine          
#> # ... with 77 more rows

Created on 2021-02-16 by the reprex package (v0.3.0)

How can I avoid the extra quotation marks in `"homeworld"_old` when using the double curly {{ }} and glue := syntax? This is shown to work for summarise("mean_{{expr}}" := mean({{ expr }}), ...) in a function here.

asked Oct 28 '25 08:10 by Alwin


2 Answers

The {{ operator inside the glue mechanism works at the level of expressions, not strings. When an expression contains a string, the quotes (") are also a part of that same expression, which is why you see them in the output. If you convert your string to a variable name, everything should work as expected:

sel_var <- as.name("homeworld")

starwars %>% 
  select("{{sel_var}}_old" := {{ sel_var }})
# # A tibble: 87 x 1
#    homeworld_old
#    <chr>        
#  1 Tatooine     
#  2 Tatooine     
# ...
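
The difference between the two inputs can be seen by labelling them directly. This is only a sketch, and it assumes that {{ inside a glue string labels its input the same way rlang::as_label() does; the variable names are illustrative:

```r
library(rlang)

# A string is deparsed together with its quotes.
label_from_string <- as_label("homeworld")
print(label_from_string)
#> [1] "\"homeworld\""

# A symbol is labelled by its bare name.
label_from_symbol <- as_label(as.name("homeworld"))
print(label_from_symbol)
#> [1] "homeworld"
```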

Note that the summarise("mean_{{expr}}" := mean({{ expr }}), ...) example you linked behaves the same way. For example, here's one of the functions defined in that vignette:

my_summarise5 <- function(data, mean_var, sd_var) {
  data %>% 
    summarise(
      "mean_{{mean_var}}" := mean({{ mean_var }}), 
      # Note: mean() is applied here too (not sd()), so both columns hold the mean.
      "sd_{{sd_var}}" := mean({{ sd_var }})
    )
}

Everything works as expected when you pass variable names to the function:

my_summarise5( mtcars, mpg, mpg )
#   mean_mpg   sd_mpg
# 1 20.09062 20.09062

However, passing strings will include " in the output, as in your case:

my_summarise5( mtcars, "mpg", "mpg" )
#   mean_"mpg" sd_"mpg"
# 1         NA       NA
# Warning messages:
# 1: In mean.default(~"mpg") :
#   argument is not numeric or logical: returning NA
# 2: In mean.default(~"mpg") :
#   argument is not numeric or logical: returning NA
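
The same idea can be wrapped in a function for the select() case from the question. This is a minimal sketch: select_old() is a hypothetical helper name that does not appear in the question or the vignette, and it assumes its argument is a bare column name. If the name is held in a string, one option is to convert it to a symbol and unquote it at the call site:

```r
library(dplyr)

# Hypothetical helper: `var` is expected to be a bare column name,
# so {{ }} sees a symbol and the glue string gets no quotes.
select_old <- function(data, var) {
  data %>%
    select("{{var}}_old" := {{ var }})
}

# Bare name passed directly.
from_name <- names(select_old(starwars, homeworld))

# String held in a variable: convert to a symbol, then unquote with !!.
sel_var <- "homeworld"
from_string <- names(select_old(starwars, !!rlang::sym(sel_var)))

print(from_name)
print(from_string)
```

Both calls should produce a single column named homeworld_old.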
answered Oct 30 '25 23:10 by Artem Sokolov


The value inside {{ }} has to be unquoted in order to be evaluated, so build the new column name first and then use it on the left-hand side of :=.

Here are two ways:

  1. Using curly-curly ({{}}).
library(dplyr)
library(rlang)

sel_var = 'homeworld'
new_col <- paste0(sel_var, '_old')

starwars %>% 
  select({{ sel_var }}) %>% 
  rename({{new_col}} := {{ sel_var }})

# A tibble: 87 x 1
#   homeworld_old
#   <chr>        
# 1 Tatooine     
# 2 Tatooine     
# 3 Naboo        
# 4 Tatooine     
# 5 Alderaan     
# 6 Tatooine     
# 7 Tatooine     
# 8 Tatooine     
# 9 Tatooine     
#10 Stewjon      
# … with 77 more rows
  2. Using bang-bang (!!), which returns the same output.
starwars %>% 
  select({{ sel_var }}) %>% 
  rename(!!new_col := {{ sel_var }})
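
When the column name is already a string, the new name can also be built inside the glue string itself. The sketch below is an alternative to the two approaches above rather than part of them. It assumes a version of rlang that supports glue-style names (single braces interpolate the value of a variable) and uses tidyselect's all_of() to select the column named by the string:

```r
library(dplyr)

sel_var <- "homeworld"

# "{sel_var}_old" interpolates the string value of sel_var into the new name;
# all_of(sel_var) selects the column whose name is stored in sel_var.
res <- starwars %>%
  select("{sel_var}_old" := all_of(sel_var))

print(names(res))
```

Note the single braces: {sel_var} inserts the string's value, whereas {{sel_var}} labels the expression and keeps the quotes.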
answered Oct 31 '25 01:10 by Ronak Shah