Say that I have these data:
library(dplyr)
df1 <- data.frame(x = c(1, 2, 3, 4), z = c("A", "A", "B", "B"))
df2 <- data.frame(x = c(2, 4, 6, 8), z = c("A", "A", "B", "C"))
I can easily check if each element of x in df1 is present in x of df2:
df1 <- df1 %>% mutate(present = x %in% df2$x)
Is there an easy way to do the same thing (preferably in the tidyverse), but checking only within each group?
In other words, for an observation in df1 to have present be TRUE, two things must be true: 1) the group (z) in df2 must be the same as the group in df1 and 2) the value of x in df2 must be the same as the value in df1.
So, only the second observation (x = 2) would be TRUE, because df2 contains an observation with an x of 2 and a z of A. The last observation (x = 4) would be FALSE: df2 does contain an x of 4, but that observation is in group A, not B.
An approach with inner_join
Edit: now works with multiple matches and no longer uses a temporary variable.
library(dplyr)
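# Stack df1 (id "1") on top of the df1 rows that match df2 on both x and z (id "2")
bind_rows(df1, inner_join(df1, df2, by = c("x", "z")), .id = "id") %>%
  # a row is present when its (x, z) group contains rows from both sources
  summarize(present = n() > 1 & var(id) > 0, .by = -id)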
output
x z present
1 1 A FALSE
2 2 A TRUE
3 3 B FALSE
4 4 B FALSE
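An approach with left_join
library(tidyverse)
df1 %>%
  # flag every (x, z) pair in df2, then join the flag onto df1
  left_join(df2 %>% mutate(present = TRUE), by = c("x", "z")) %>%
  # rows of df1 without a match get NA from the join; turn those into FALSE
  replace_na(list(present = FALSE))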
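Another option that stays close to the original `%in%` check is to group df1 by z and compare each group's x only against the df2 values from the same group. This is a minimal sketch rather than a tested general solution: it assumes z is the only grouping column, and `result` is just an illustrative name.

```r
library(dplyr)

df1 <- data.frame(x = c(1, 2, 3, 4), z = c("A", "A", "B", "B"))
df2 <- data.frame(x = c(2, 4, 6, 8), z = c("A", "A", "B", "C"))

result <- df1 %>%
  group_by(z) %>%
  # within a group every z is identical, so first(z) is the group's key;
  # subset df2$x to that key before applying %in%
  mutate(present = x %in% df2$x[df2$z == first(z)]) %>%
  ungroup()

result$present
# FALSE TRUE FALSE FALSE
```

Because this uses `mutate()` rather than a join, it keeps exactly one output row per row of df1, even if df2 contains the same (x, z) pair more than once.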