I have an R data.table with 25 columns: the first column is an ID and the other 24 are integer variables. There are close to 1 million rows. How do I convert all the non-zero values to 1? For example:
Custid A B C
123 0 8 0
124 0 0 6
Should become
Custid A B C
123 0 1 0
124 0 0 1
Assuming your data.table is called 'dt',
df = as.data.frame(dt)
df[,-1] = (df[,-1] != 0)*1
This works. The -1 index excludes the first column; the comparison inside the parentheses returns TRUE or FALSE for each element, and multiplying by 1 converts that result to numeric form.
If you want, you can turn it back into a data.table:
dt = data.table(df)
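To make the comparison-then-multiply step concrete, here is a minimal sketch in base R. It assumes the two sample rows from the question; the intermediate name `flags` is only for illustration and is not part of the original answer.

```r
# Sample data from the question, built as a plain data.frame
df <- data.frame(Custid = c(123L, 124L),
                 A = c(0L, 0L), B = c(8L, 0L), C = c(0L, 6L))

# Logical matrix: TRUE where a value is non-zero (ID column excluded)
flags <- df[, -1] != 0

# Multiplying by 1 turns TRUE/FALSE into 1/0, stored as doubles
df[, -1] <- flags * 1
```

Note that this route copies the data and leaves the columns as doubles rather than integers, which may matter with close to a million rows.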
An efficient option is the set function, which replaces values in place. Loop through the columns with a for loop and set the 'value' to 1 wherever the element is not equal to 0, specifying the row index 'i' and the column index 'j'.
for (j in 2:ncol(dt)) {
  # overwrite the non-zero rows of column j by reference; 1L keeps the column integer
  set(dt, i = which(dt[[j]] != 0), j = j, value = 1L)
}
dt
# Custid A B C
#1: 123 0 1 0
#2: 124 0 0 1
Another option is to use lapply to loop over the Subset of Data.table (.SD), restricted to the columns given in .SDcols, and assign the result back by reference with :=
dt[, names(dt)[-1] := lapply(.SD, function(x) as.integer(x!=0)), .SDcols = 2:ncol(dt)]
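For anyone who wants to try this end to end, the following is a small self-contained sketch using the question's sample rows. It assumes the data.table package is installed; the helper vector `cols` is introduced here for readability and selects the columns by name rather than by position.

```r
library(data.table)

# Sample data from the question
dt <- data.table(Custid = c(123L, 124L),
                 A = c(0L, 0L), B = c(8L, 0L), C = c(0L, 6L))

# Every column except the ID
cols <- setdiff(names(dt), "Custid")

# Replace each selected column by reference with its 0/1 indicator
dt[, (cols) := lapply(.SD, function(x) as.integer(x != 0)), .SDcols = cols]
```

One difference worth knowing: with as.integer(x != 0) an NA stays NA, and the set loop also leaves NA untouched because which() skips it, so neither approach turns missing values into 0 or 1.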