Is it possible to store the order of rows in a data.table while preserving its keys?
Let's say I have the following dummy table:
library(data.table)
dt <- data.table(id=letters[1:6],
                 group=sample(c("red", "blue"), replace=TRUE),
                 value.1=rnorm(6),
                 value.2=runif(6))
setkey(dt, id)
dt
id group value.1 value.2
1: a blue 1.4557851 0.73249612
2: b red -0.6443284 0.49924102
3: c blue -1.5531374 0.72977197
4: d red -1.5977095 0.08033604
5: e blue 1.8050975 0.43553048
6: f red -0.4816474 0.23658045
I would like to store this table so that rows are ordered by group and then by value.1, both in decreasing order, i.e.:
> dt[order(group, value.1, decreasing=TRUE),]
id group value.1 value.2
1: f red -0.4816474 0.23658045
2: b red -0.6443284 0.49924102
3: d red -1.5977095 0.08033604
4: e blue 1.8050975 0.43553048
5: a blue 1.4557851 0.73249612
6: c blue -1.5531374 0.72977197
Obviously I can save this as a new variable, but I also want to keep the id column as my primary key.
Arun's answer to "What is the purpose of setting a key in data.table?" suggests that this can be achieved with clever use of setkey, since it sorts the data.table by its key columns (although there is no option to sort a key in decreasing order):
> setkey(dt, group, value.1, id)
> dt
id group value.1 value.2
1: c blue -1.5531374 0.72977197
2: a blue 1.4557851 0.73249612
3: e blue 1.8050975 0.43553048
4: d red -1.5977095 0.08033604
5: b red -0.6443284 0.49924102
6: f red -0.4816474 0.23658045
However, I lose the ability to use id as my primary key, because group is the first key provided:
> dt["a"]
group id value.1 value.2
1: a NA NA NA
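It sounds like you only want to change how the table is displayed, in which case you can override print.data.table and leave the key on id untouched:
print.data.table = function(x, ...) {
  # put whatever condition identifies your tables here
  if ("group" %in% names(x) && "value.1" %in% names(x)) {
    # print a reordered copy; the stored table and its key are unchanged
    data.table:::print.data.table(x[order(group, value.1, decreasing = TRUE)], ...)
  } else {
    # fall back to the default print method for every other table
    data.table:::print.data.table(x, ...)
  }
}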
set.seed(2)
dt = data.table(id=letters[1:6],
                group=sample(c("red", "blue"), replace=TRUE),
                value.1=rnorm(6),
                value.2=runif(6))
setkey(dt, id)
dt
# id group value.1 value.2
#1: a red 0.18484918 0.40528218
#2: e red 0.13242028 0.44480923
#3: c red -1.13037567 0.97639849
#4: b blue 1.58784533 0.85354845
#5: f blue 0.70795473 0.07497942
#6: d blue -0.08025176 0.22582546
dt["c"]
# id group value.1 value.2
#1: c red -1.130376 0.9763985
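If you would rather have the rows physically stored in that order instead of only printed that way, one possible alternative is to drop the key, sort with setorder, and look rows up by id through the on= argument. This is a minimal sketch, not part of the answer above: it assumes a data.table version that supports on= and setindex (1.9.6 or later), and it fills group with rep() purely so the example is deterministic.

```r
library(data.table)

set.seed(2)
# Illustrative data; group is built with rep() here (an assumption made
# for reproducibility, not the original sample() call).
dt <- data.table(id = letters[1:6],
                 group = rep(c("red", "blue"), 3),
                 value.1 = rnorm(6),
                 value.2 = runif(6))

# Reorder the rows by reference: group descending, then value.1 descending.
# The table is deliberately left unkeyed so this order is what gets stored.
setorder(dt, -group, -value.1)

# Optional secondary index on id; unlike setkey, it does not reorder rows.
setindex(dt, id)

# Look up a row by id with an ad hoc join instead of relying on the key.
res <- dt["c", on = "id"]
print(res)
```

The trade-off is that every lookup has to name the column via on = "id", whereas a keyed table lets you write dt["c"] directly.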