Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Data.table conditional merges

Tags:

r

data.table

I want to merge two data.tables using a condition on a variable in dt1. See the following example:

library(data.table)

# Example data tables
dt1 <- data.table(id = 1:5, var1 = c(1, 0, 1, 0, 1), name1 = c("A", "B", "C", "D", "E"))
dt2 <- data.table(name2 = c("A", "B", "C", "D", "E"), other_var = 101:105)
> dt1
   id var1 name1
1:  1    1     A
2:  2    0     B
3:  3    1     C
4:  4    0     D
5:  5    1     E
> dt2
   name2 other_var
1:     A       101
2:     B       102
3:     C       103
4:     D       104
5:     E       105

What I would like to do is merge dt1 and dt2 such that a value is merged only if var1 == 1 in dt1. Desired result:

> dt1
   id var1 name1  other_var
1:  1    1     A       101
2:  2    0     B        NA
3:  3    1     C       103
4:  4    0     D        NA
5:  5    1     E       105

I've tried something like:


dt1[var1 == 1][dt2, other_var := i.other_var, on = .(name1 = name2)] 

and

dt1[dt2, on = .(name1 = name2), other_var := i.other_var, var1 == 1]

but neither approach seems to work. What am I doing wrong?

like image 762
Juustomies6 Avatar asked Oct 27 '25 08:10

Juustomies6


2 Answers

You are expressing a condition by which you want to join, so set dt2[, var1 := 1] then join on var1 as well as well as name:

dt2[, var1 := 1]

dt1[dt2,
    on = .(name1 = name2, var1),
    other_var := i.other_var
][]

#       id  var1  name1 other_var
#    <int> <num> <char>     <int>
# 1:     1     1      A       101
# 2:     2     0      B        NA
# 3:     3     1      C       103
# 4:     4     0      D        NA
# 5:     5     1      E       105

This should be reasonably fast. If adding the column is a problem just remove it after the join (dt2[, var1 := NULL]).

like image 120
SamR Avatar answered Oct 28 '25 23:10

SamR


You can try

dt1[dt2,
    other_var := replace(other_var, !var1, NA),
    on = .(name1 = name2)
]

or

dt1[dt2,
    other_var := NA^(!var1) * other_var,
    on = .(name1 = name2)
]

which gives

   id var1 name1 other_var
1:  1    1     A       101
2:  2    0     B        NA
3:  3    1     C       103
4:  4    0     D        NA
5:  5    1     E       105
like image 41
ThomasIsCoding Avatar answered Oct 28 '25 22:10

ThomasIsCoding



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!