Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to merge two data.tables on the closest value, but also by group in R?

Tags:

r

data.table

I have this data:

library(data.table)

bioargo <- data.table(
  grp = c("a", "a", "b", "b"),
  val = 1:4,
  x = c(2.1, 2.2, 1.9, 3)
)

hplc <- data.table(
  x = c(2, 2.3),
  z = c("foo", "bar")
)

I would like to merge both data.tables on the nearest x value but by grp, so the output is as follows (i.e. for each row in hplc, get the closest x for each grp in bioargo):

data.table(
  x = c(2, 2.3),
  z = c("foo", "bar"),
  val = c(1, 3, 2, 2)
)
#>      x   z val
#> 1: 2.0 foo   1
#> 2: 2.3 bar   3
#> 3: 2.0 foo   2
#> 4: 2.3 bar   2

I have tried the following, but it is not returning what is required.

hplc[bioargo, on = "x", roll = "nearest"]
#>      x   z grp val
#> 1: 2.1 foo   a   1
#> 2: 2.2 bar   a   2
#> 3: 1.9 foo   b   3
#> 4: 3.0 bar   b   4
bioargo[hplc, on = "x", roll = "nearest"]
#>    grp val   x   z
#> 1:   b   3 2.0 foo
#> 2:   a   2 2.3 bar

Created on 2022-11-28 with reprex v2.0.2

like image 329
Philippe Massicotte Avatar asked Sep 06 '25 21:09

Philippe Massicotte


1 Answers

One way to solve your problem:

bioargo[, .SD[hplc, on="x", roll="nearest"], by=grp]

      grp   val     x      z
1:      a     1   2.0    foo
2:      a     2   2.3    bar
3:      b     3   2.0    foo
4:      b     3   2.3    bar
like image 199
B. Christian Kamgang Avatar answered Sep 10 '25 01:09

B. Christian Kamgang