I would like to call tidyr::gather() inside a custom function, to which I pass a pair of character variables that will be used to rename the key and value columns. e.g.
myFunc <- function(mydata, key.col, val.col) {
  new.data <- tidyr::gather(data = mydata, key = key.col, value = val.col)
  return(new.data)
}
However, this does not work as desired.
temp.data <- data.frame(day.1 = c(20, 22, 23), day.2 = c(32, 22, 45), day.3 = c(17, 9, 33))
# Call my custom function, renaming the key and value columns
# "day" and "temp", respectively
long.data <- myFunc(mydata = temp.data, key.col = "day", val.col = "temp")
# Columns have *not* been renamed as desired
head(long.data)
key.col val.col
1 day.1 20
2 day.1 22
3 day.1 23
4 day.2 32
5 day.2 22
6 day.2 45
Desired output:
head(long.data)
day temp
1 day.1 20
2 day.1 22
3 day.1 23
4 day.2 32
5 day.2 22
6 day.2 45
My understanding is that gather() uses bare variable names for most arguments (as it has in this example, using "key.col" as the column name as opposed to the value stored in key.col). I have attempted a number of ways of passing a value in the gather() call, but most return errors. For example, these three variants of the gather() call within myFunc return Error: Invalid column specification (ignoring, for illustrative purposes, the value parameter, which has identical behavior):
gather(data = mydata, key = as.character(key.col), value = val.col)
gather(data = mydata, key = as.name(key.col), value = val.col)
gather(data = mydata, key = as.name(as.character(key.col)), value = val.col)
As a workaround, I just rename the columns following the call to gather():
colnames(long.data)[colnames(long.data) == "key"] <- "day"
But given gather()'s purported functionality for renaming the key/value columns, how can I do this in the gather() call within a custom function?
To do this inside a function, use gather_(), the standard-evaluation variant of gather(), which takes the column names as strings:
myFunc <- function(mydata, key.col, val.col, gather.cols) {
  new.data <- gather_(data = mydata,
                      key_col = key.col,
                      value_col = val.col,
                      gather_cols = colnames(mydata)[gather.cols])
  return(new.data)
}
temp.data <- data.frame(day.1 = c(20, 22, 23), day.2 = c(32, 22, 45),
day.3 = c(17, 9, 33))
temp.data
day.1 day.2 day.3
1 20 32 17
2 22 22 9
3 23 45 33
# Call my custom function, renaming the key and value columns
# "day" and "temp", respectively
long.data <- myFunc(mydata = temp.data, key.col = "day", val.col = "temp",
                    gather.cols = 1:3)
# Columns *have* been renamed as desired
head(long.data)
day temp
1 day.1 20
2 day.1 22
3 day.1 23
4 day.2 32
5 day.2 22
6 day.2 45
The main difference is that with gather_() you have to specify the columns you want to gather explicitly, using the gather_cols argument.
Having had the same question, I found the answer here: https://dplyr.tidyverse.org/articles/programming.html
You can make gather() use the value stored in a variable, rather than the variable's name, by prefixing the variable with two exclamation marks (the !! operator). In your original question, the code would be:
gather(data = mydata, key = !!key.col, value = !!val.col)
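For completeness, here is a minimal sketch of how that call might look inside the myFunc() from the question. It assumes a tidyr version in which gather() supports tidy evaluation (the !! operator) and reuses the question's example data; it is an illustration, not the only way to write it.

```r
# Sketch: the question's myFunc(), with !! passing the stored strings.
# Assumption: the installed tidyr's gather() supports tidy evaluation.
myFunc <- function(mydata, key.col, val.col) {
  # !!key.col hands gather() the value of key.col (e.g. "day"),
  # not the literal argument name "key.col"
  tidyr::gather(data = mydata, key = !!key.col, value = !!val.col)
}

temp.data <- data.frame(day.1 = c(20, 22, 23),
                        day.2 = c(32, 22, 45),
                        day.3 = c(17, 9, 33))

long.data <- myFunc(mydata = temp.data, key.col = "day", val.col = "temp")

# The key and value columns should now be named "day" and "temp"
colnames(long.data)
```

Because no columns are named after the value argument, gather() collects all three columns here, just as in the original question.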