
Accessing the column name within the .SD construct

Tags:

r

data.table

I have a data.table in R that looks like this:

DT = data.table(a = c(1,2,3,4,5), a_mean = c(1,1,2,2,2), b = c(6,7,8,9,10), b_mean = c(3,2,1,1,2))

I want to create two more columns, a_final and b_final, defined as a_final = (a - a_mean) and b_final = (b - b_mean). In my real-life use case there can be a large number of such column pairs, so I want a scalable solution in the spirit of data.table.

I tried something along the lines of

DT[,paste0(c('a','b'),'_final') := lapply(.SD, function(x) ((x-get(paste0(colnames(.SD),'_mean'))))), .SDcols = c('a','b')]

but this doesn't quite work. Any idea of how I can access the column name of the column being processed within the lapply statement?

Dinesh asked Aug 31 '25 02:08



2 Answers

We can build a character vector of the base column names, subset those columns from the original data.table, subset their corresponding "_mean" columns, subtract the two, and assign the result as new columns.

library(data.table)
# Base names: strip everything from the first "_" onward (thanks to @Sotos)
cols <- unique(sub('_.*', '', names(DT)))
# Or simply list them by hand:
# cols <- c('a', 'b')

DT[,paste0(cols, '_final')] <- DT[,cols, with = FALSE] - 
                               DT[,paste0(cols, "_mean"), with = FALSE] 
DT
#   a a_mean  b b_mean a_final b_final
#1: 1      1  6      3       0       3
#2: 2      1  7      2       1       5
#3: 3      2  8      1       1       7
#4: 4      2  9      1       2       8
#5: 5      2 10      2       3       8
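
The `DT[, ...] <- ...` form above goes through ordinary R assignment rather than data.table's update by reference. A minimal sketch of a by-reference alternative, assuming the same `DT` and the `_mean` / `_final` naming from the question, loops over the base names with `set()`. The loop variable `nm` is only an illustrative name:

```r
library(data.table)

# Same example data as in the question
DT <- data.table(a = c(1, 2, 3, 4, 5), a_mean = c(1, 1, 2, 2, 2),
                 b = c(6, 7, 8, 9, 10), b_mean = c(3, 2, 1, 1, 2))

cols <- c("a", "b")

# For each base name, subtract the matching "_mean" column and
# add the result as "<name>_final" by reference
for (nm in cols) {
  set(DT, j = paste0(nm, "_final"),
      value = DT[[nm]] - DT[[paste0(nm, "_mean")]])
}
```

Because the column name is the loop variable here, it is directly available for building both the "_mean" and the "_final" names.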
Ronak Shah answered Sep 02 '25 17:09



Another option is using mget with Map:

cols <- c('a', 'b')
DT[, paste0(cols,'_final') := Map(`-`, mget(cols), mget(paste0(cols,"_mean")))]
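
If the goal is specifically to have the column name available inside `lapply`, one possible sketch is to iterate over the names instead of over `.SD`, and index `.SD` by name. This assumes every column in `cols` has a matching "_mean" column; `nm` and `final_cols` are illustrative names, not part of the answer above:

```r
library(data.table)

# Same example data as in the question
DT <- data.table(a = c(1, 2, 3, 4, 5), a_mean = c(1, 1, 2, 2, 2),
                 b = c(6, 7, 8, 9, 10), b_mean = c(3, 2, 1, 1, 2))

cols <- c("a", "b")
final_cols <- paste0(cols, "_final")

# lapply runs over the column *names*, so nm is the name of the
# column being processed; .SD holds both the base and "_mean" columns
DT[, (final_cols) := lapply(cols, function(nm) {
       .SD[[nm]] - .SD[[paste0(nm, "_mean")]]
     }),
   .SDcols = c(cols, paste0(cols, "_mean"))]
```

When `lapply` runs over `.SD` itself, the function only receives each column's values, which is why the name has to come from iterating over `cols`.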
chinsoon12 answered Sep 02 '25 17:09
