In SQL, you can join by multiple columns specified with OR. I need it to bring in new columns from a lookup, where there are multiple identifiers that might match.
For example, joining x and y by columns a and b.
x <- tibble::tribble(
~a, ~b, ~old_info,
1, 10, "old1",
2, 20, "old2",
3, 30, "old3",
4, 40, "old4",
5, 50, "old5"
)
y <- tibble::tribble(
~a, ~b, ~new_info,
1, NA, "new1",
2, NA, "new2",
NA, NA, "new3",
NA, 40, "new4",
NA, 50, "new5"
)
This would be the desired output.
#> # A tibble: 5 × 4
#> a b old_info new_info
#> <dbl> <dbl> <chr> <chr>
#> 1 1 10 old1 new1
#> 2 2 20 old2 new2
#> 3 3 30 old3 NA
#> 4 4 40 old4 new4
#> 5 5 50 old5 new5
Maybe even two joins and a coalesce?
x %>%
left_join(y, by = c(a = "a")) %>%
left_join(y, by = c(b.x = "b")) %>%
mutate(new_info = coalesce(new_info.x, new_info.y)) %>%
select(a = a.x, b = b.x, old_info, new_info)
# A tibble: 5 × 4
a b old_info new_info
<dbl> <dbl> <chr> <chr>
1 1 10 old1 new1
2 2 20 old2 new2
3 3 30 old3 NA
4 4 40 old4 new4
5 5 50 old5 new5
Try
library(powerjoin)
power_left_join(x, y, by = ~ dplyr::if_else(.x$a == .y$a | .x$b == .y$b, TRUE, FALSE, FALSE),
keep = "left")
# A tibble: 5 × 4
a b old_info new_info
<dbl> <dbl> <chr> <chr>
1 1 10 old1 new1
2 2 20 old2 new2
3 3 30 old3 NA
4 4 40 old4 new4
5 5 50 old5 new5
from here.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With