Here is some code to generate a data.frame:
ref_variables=LETTERS[1:10]
row=100
d0=seq(1:100)
for (i in seq_along(ref_variables)){
dtemp=sample(seq(1:row),row,TRUE)
d0=data.frame(d0,dtemp)
}
d0[,1]=NULL
names(d0)=ref_variables
I have a dataset (a data.frame or a data.table, it doesn't matter). Let's say I want to modify columns 2 to 4 by dividing each of them by the first one. Of course, I can write a loop like this:
columns_name_to_divide=c("B","C","H")
column_divisor="A"
for (i in seq_along(columns_name_to_divide)){
d0[columns_name_to_divide[i]] = d0[columns_name_to_divide[i]] / d0[column_divisor]
}
But is there a more elegant way to do it?
> d0[2:4] <- d0[,2:4]/d0[,1]
This replaces the original values in columns 2, 3 and 4 with the result of dividing them by column 1. The remaining columns stay the same.
If you instead want to add 3 new columns to d0 that hold the result of dividing columns 2, 3 and 4 by column 1, assign to new column positions. This does not replace the original values in columns 2, 3 and 4; the calculated values end up in columns 11, 12 and 13 respectively.
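To make the mechanics concrete, here is a minimal sketch on a tiny data frame. The name df and its values are made up for illustration and are not part of the question's data. Because the divisor vector has one element per row, it is applied down each column in turn:

```r
# Hypothetical toy data (illustrative values only)
df <- data.frame(A = c(2, 4, 5), B = c(4, 8, 10), C = c(6, 4, 20), D = c(1, 2, 3))

# Divide columns 2 and 3 by column 1; column 4 is left untouched
df[2:3] <- df[, 2:3] / df[, 1]

print(df)
#   A B C D
# 1 2 2 3 1
# 2 4 2 1 2
# 3 5 2 4 3
```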
> dim(d0)
# [1] 100 10
> d0[11:13] <- d0[,2:4]/d0[,1]
> dim(d0)
# [1] 100 13
To round the new values to 2 decimal places, simply wrap the division in round(), like below:
> d0[2:4] <- round(d0[,2:4]/d0[,1],2) # Original values substituted at 2,3,4
# OR
> d0[11:13] <- round(d0[,2:4]/d0[,1],2) # New columns added, original columns are untouched.
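Since the question selects columns by name rather than by position, the same one-line assignment can probably be written with name vectors as well. The sketch below uses a made-up data frame df, and the "_ratio" suffix for the new columns is just an illustrative naming choice:

```r
# Hypothetical toy data; the column names mirror the question's example
df <- data.frame(A = c(2, 4), B = c(4, 8), C = c(6, 4), H = c(10, 2))

cols    <- c("B", "C", "H")   # columns to divide
divisor <- "A"                # column to divide by

# Add new columns and keep the originals (the suffix is an arbitrary choice)
df[paste0(cols, "_ratio")] <- df[cols] / df[[divisor]]

# Or overwrite the selected columns in place
df[cols] <- df[cols] / df[[divisor]]
```

Using df[[divisor]] extracts the divisor as a plain vector, so it is applied to every selected column.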
We can use set from data.table, which makes this more efficient because the overhead of [.data.table is avoided when it is called multiple times (though not in this case).
library(data.table)
setDT(d0)
for(j in columns_name_to_divide){
set(d0, i = NULL, j = j, value = d0[[j]]/d0[[column_divisor]])
}
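As a small illustration of the loop above, here is a sketch on a toy table. The name dt and its values are invented for the example. Note that set() updates the table in place, so its result does not need to be assigned back:

```r
library(data.table)

# Hypothetical toy data (illustrative values only)
dt <- data.table(A = c(2, 4), B = c(4, 8), C = c(6, 4))

for (j in c("B", "C")) {
  # i = NULL means all rows; the column named j is replaced in place
  set(dt, i = NULL, j = j, value = dt[[j]] / dt[["A"]])
}
```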
Or using lapply
setDT(d0)[, (columns_name_to_divide) := lapply(.SD, `/`,
d0[[column_divisor]]), .SDcols = columns_name_to_divide]
Or an elegant option using dplyr
library(dplyr)
library(magrittr)
d0 %<>%
mutate_each_(funs(./d0[[column_divisor]]), columns_name_to_divide)
head(d0)
# A B C D E F G H I J
#1 60 0.4000000 1.1500000 6 86 27 19 0.150000 94 97
#2 11 0.6363636 0.3636364 25 52 44 82 8.818182 84 68
#3 80 0.8750000 1.1375000 72 34 56 69 0.125000 34 17
#4 77 0.3116883 1.0259740 9 44 87 61 1.064935 79 40
#5 18 0.3333333 5.0555556 60 69 62 89 2.166667 21 34
#6 42 1.3333333 2.3095238 61 20 87 95 1.428571 78 63
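mutate_each_() and funs() were deprecated in later dplyr releases. Assuming dplyr 1.0.0 or newer, a roughly equivalent sketch uses across() with all_of(). The data frame df below is a made-up example, not the d0 from the question:

```r
library(dplyr)

# Hypothetical toy data (illustrative values only)
df <- data.frame(A = c(2, 4), B = c(4, 8), C = c(6, 4), H = c(10, 2))

cols    <- c("B", "C", "H")
divisor <- "A"

# all_of() selects columns from a character vector;
# the lambda divides each selected column by the divisor column
df <- df %>%
  mutate(across(all_of(cols), ~ .x / df[[divisor]]))
```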
Benchmarks on a larger dataset:
set.seed(42)
d1 <- as.data.frame(matrix(sample(1:9, 1e7*7, replace=TRUE), ncol=7))
d2 <- copy(d1)
d3 <- copy(d1)
system.time({
d2 %<>%
mutate_each(funs(./d2[["V2"]]), V4:V7)
})
# user system elapsed
# 0.52 0.39 0.91
system.time({
d1[,4:7] <- d1[,4:7]/d1[,2]
})
# user system elapsed
# 1.72 0.72 2.44
system.time({
setDT(d3)
for(j in 4:7){
set(d3, i = NULL, j = j, value = d3[[j]]/d3[["V2"]])
}
})
# user system elapsed
# 0.32 0.16 0.47