I want to change the factor levels of a column using setattr. However, when the column is selected the standard data.table way (dt[ , col]), the levels are not updated. On the other hand, when the column is selected in a way that is unorthodox for data.table, namely with $, it works.
library(data.table)
# Some data
d <- data.table(x = factor(c("b", "a", "a", "b")), y = 1:4)
d
# x y
# 1: b 1
# 2: a 2
# 3: a 3
# 4: b 4
# We want to change levels of 'x' using setattr
# New desired levels
lev <- c("a_new", "b_new")
# Select column in the standard data.table way
setattr(x = d[ , x], name = "levels", value = lev)
# Levels are not updated
d
# x y
# 1: b 1
# 2: a 2
# 3: a 3
# 4: b 4
# Select column in a non-standard data.table way using $
setattr(x = d$x, name = "levels", value = lev)
# Levels are updated
d
# x y
# 1: b_new 1
# 2: a_new 2
# 3: a_new 3
# 4: b_new 4
# Just check if d[ , x] really is the same as d$x
d <- data.table(x = factor(c("b", "a", "a", "b")), y = 1:4)
identical(d[ , x], d$x)
# [1] TRUE
# Yes, it seems so
It feels like I'm missing some data.table (R?) basics here. Can anyone explain what's going on?
I have found two other posts on setattr and levels:
setattr on levels preserving unwanted duplicates (R data.table)
How does one change the levels of a factor column in a data.table
Both of them used $ to select the column. Neither of them mentioned the [ , col] way.
It might help to understand if you look at the address from both expressions:
address(d$x)
# [1] "0x10e4ac4d8"
address(d$x)
# [1] "0x10e4ac4d8"
address(d[,x])
# [1] "0x105e0b520"
address(d[,x])
# [1] "0x105e0a600"
Note that the address returned by the first expression (d$x) stays the same across calls, while the address returned by the second (d[,x]) is different each time. This indicates that d[,x] makes a new copy of the column on every call, so setattr modifies that temporary copy and has no effect on the original data.table.
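As a minimal sketch of the consequence, reusing the question's data and assuming the copying behaviour described above holds in your data.table version: the attribute has to be set on an expression that refers to the column itself, such as d$x. The final check uses only base R's levels() and as.character().

```r
library(data.table)

# Same toy data as in the question
d <- data.table(x = factor(c("b", "a", "a", "b")), y = 1:4)
lev <- c("a_new", "b_new")

# Compare where the two expressions point; per the explanation above,
# d[, x] is expected to sit at a different address than the column itself
address(d$x)
address(d[, x])

# Set the attribute through d$x so the change is made by reference
# on the column stored inside 'd'
setattr(d$x, "levels", lev)

levels(d$x)
# [1] "a_new" "b_new"
```

Because identical() compares values rather than memory locations, it returns TRUE for d[ , x] and d$x even though only one of them points at the column stored in d.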