I write and review a fair amount of R code like this:
df <- data.frame(replicate(10, sample(0:5, 10, replace = TRUE)))
my.func <- function(col, y) {col %in% y}
df$X2 <- my.func(df$X2, c(1,2))
df$X3 <- my.func(df$X3, c(4,5))
df$X5 <- my.func(df$X5, c(1,2))
df$X6 <- my.func(df$X6, c(4,5))
df$X8 <- my.func(df$X8, c(4,5))
df$X9 <- my.func(df$X9, c(1,2))
df$X10 <- my.func(df$X10, c(1))
That is, certain columns in a data.frame (or data.table) are transformed using a function, where one argument is a column and the other is some arbitrary, somewhat-unique-to-that-column value.
What's a more concise way to make such transformations?
I've tried data.table's assignment-by-reference operator (:=), which makes things slightly cleaner, but each column name must still appear twice and the function must appear once per column.
A concise way is Map, passing the dataset ('df') and a list of vectors as the arguments to my.func. Map walks over the columns of the data.frame and the elements of the list in parallel, so each column is paired with the vector at the same position.
df[] <- Map(my.func, df, list(1:2, 4:5, 3:4))
NOTE: The OP did not provide the function or a minimal reproducible example, so this is untested.
NOTE2: This assumes the data has 3 columns. If there are more, extend the list so that it has one element per column.
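To make the pairing concrete, here is a minimal base R sketch. The three-column data frame df3 and the list name targets are made up for illustration, and my.func is copied from the question; treat it as one way to try the idea rather than a tested solution for the OP's data.

```r
# Hypothetical three-column data frame, so there is one list element per column
set.seed(1)
df3 <- data.frame(replicate(3, sample(0:5, 10, replace = TRUE)))
my.func <- function(col, y) col %in% y   # same function as in the question

# One vector of target values per column, in column order (illustrative values)
targets <- list(1:2, 4:5, 3:4)
stopifnot(length(targets) == ncol(df3))  # guard against a length mismatch

# Assigning into df3[] keeps the data.frame structure while replacing every column
df3[] <- Map(my.func, df3, targets)
str(df3)
```

The stopifnot() guard is an addition: without it, a list that is shorter than the number of columns would be recycled by Map rather than rejected.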
The above can also be written in data.table syntax:
library(data.table)
setDT(df)[, names(df) := Map(my.func, .SD, list(1:2, 4:5, 3:4))]
If only a subset of columns needs to be changed, specify those columns in .SDcols and replace names(df) on the left-hand side of := with the same subset of names.
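A sketch of that subset variant, assuming the data.table package is installed. The four-column table dt, the vector cols, and the target values are hypothetical and chosen only to show the shape of the call.

```r
library(data.table)

# Hypothetical four-column table; columns are named X1..X4
set.seed(1)
dt <- as.data.table(data.frame(replicate(4, sample(0:5, 10, replace = TRUE))))
my.func <- function(col, y) col %in% y   # same function as in the question

# Only these columns are transformed; the list is in the same order as cols
cols <- c("X2", "X4")
dt[, (cols) := Map(my.func, .SD, list(1:2, 4:5)), .SDcols = cols]

# X2 and X4 are now logical; X1 and X3 are left untouched
str(dt)
```

Wrapping cols in parentheses on the left of := tells data.table to use the names stored in the variable rather than create a column literally called "cols".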
Or with the tidyverse (map2_dfc comes from purrr); note that the result is returned rather than assigned back to df:
library(tidyverse)
map2_dfc(df, list(1:2, 4:5, 3:4), my.func)
OP's request from a comment:
make the association between column names and function argument(s) for those columns more explicit
Adjusting the Map approach seen in the other answers:
yL <- list(X2 = 1:2, X3 = 4:5, X5 = 1:2, X6 = 4:5, X8 = 4:5, X9 = 1:2, X10 = 1)
df[names(yL)] <- Map(my.func, df[names(yL)], y = yL)
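For anyone who wants to run the named-list version end to end, here is a self-contained sketch using the data and values from the question. The check for unknown column names and the comparison against the explicit form are additions for illustration, not part of the original answer.

```r
set.seed(1)
df <- data.frame(replicate(10, sample(0:5, 10, replace = TRUE)))
my.func <- function(col, y) col %in% y

# Column name -> values to match, as in the question's explicit calls
yL <- list(X2 = 1:2, X3 = 4:5, X5 = 1:2, X6 = 4:5, X8 = 4:5, X9 = 1:2, X10 = 1)

# Added guard: fail early if a name in yL does not exist in df (e.g. a typo)
missing_cols <- setdiff(names(yL), names(df))
if (length(missing_cols) > 0) {
  stop("unknown columns: ", paste(missing_cols, collapse = ", "))
}

# Explicit form from the question, computed first for comparison
expected_X2 <- my.func(df$X2, c(1, 2))

df[names(yL)] <- Map(my.func, df[names(yL)], y = yL)
stopifnot(identical(df$X2, expected_X2))
```

Columns that are not named in yL (X1, X4 and X7 here) are left as they were.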
With data.table:
# assumes DT is a data.table, e.g. DT <- as.data.table(df)
# passing the names via .SDcols saves you from writing DT twice
DT[, names(yL) := Map(my.func, .SD, y = yL), .SDcols = names(yL)]