I want to see how many email addresses contain the last name of the email's owner.
Each row in a dataframe contains a last name and an email address. I want to add a third column with a "yes" or a "no" indicating the presence of the last name in the email on that row.
Using a for loop works fine...but I can't help thinking there's probably a better R solution. Any suggestions on how to make this more elegant?
vec1 <- c("foo", "smith")
vec2 <- c("[email protected]", "[email protected]")
df <- data.frame(vec1, vec2)
for (i in 1:nrow(df)) {
  if (grepl(df$vec1[i], df$vec2[i]) == TRUE) {
    df$lastNameInEmail[i] <- "Yes"
  } else {
    df$lastNameInEmail[i] <- "No"
  }
}
vec1 vec2 lastNameInEmail
1 foo [email protected] Yes
2 smith [email protected] No
You can use str_detect from the stringr package:
stringr::str_detect(vec2,paste(vec1,collapse = '|'))
[1] TRUE FALSE
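One thing to watch: collapsing the names with | tests each email against every last name, not only the one on the same row. A minimal base R sketch of the difference follows. The example.com addresses and the variable names last_name and email are invented for illustration, not taken from the question.

```r
# Hypothetical data: the addresses are invented for illustration.
last_name <- c("foo", "smith", "jones")
email <- c("foo@example.com", "jsmyth@example.com", "smith@example.com")

# Collapsed pattern "foo|smith|jones": TRUE if ANY last name occurs in the email.
any_name <- grepl(paste(last_name, collapse = "|"), email)

# Row-wise: element i of last_name is matched only against element i of email.
same_row <- unname(mapply(grepl, last_name, email))

any_name  # TRUE FALSE TRUE
same_row  # TRUE FALSE FALSE
```

The third row shows the gap: "smith" appears in the email, but it is not that row's last name. Since str_detect is vectorised over both string and pattern, stringr::str_detect(email, last_name) should give the same row-wise result.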
Here is a version using base R functions which works for more than the two given rows:
vec1 <- c("foo", "smith", "jones", "bar")
vec2 <- c("[email protected]", "[email protected]", "[email protected]", "[email protected]")
df <- data.frame(vec1, vec2)
df$lastNameInEmail <- sapply(seq_len(nrow(df)), function(x) {
  ifelse(grepl(df$vec1[x], df$vec2[x]), "Yes", "No")
})
df
vec1 vec2 lastNameInEmail
1: foo [email protected] Yes
2: smith [email protected] No
3: jones [email protected] No
4: bar [email protected] Yes
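Since the original goal was to see how many emails contain the owner's last name, the logical result can simply be summed. The sketch below is one possible way to do that, and it also lowercases both sides and passes fixed = TRUE so the name is matched literally rather than as a regular expression. The data frame people, its columns, and the addresses are hypothetical.

```r
# Hypothetical data: names and addresses are invented for illustration.
people <- data.frame(
  last_name = c("Foo", "Smith", "O.Neil"),
  email = c("foo@example.com", "jsmyth@example.com", "oxneil@example.com"),
  stringsAsFactors = FALSE
)

# Literal, case-insensitive match of each last name against the email on the same row.
has_name <- unname(mapply(
  function(name, address) grepl(tolower(name), tolower(address), fixed = TRUE),
  people$last_name, people$email
))

people$lastNameInEmail <- ifelse(has_name, "Yes", "No")

# Number of rows whose email contains the owner's last name.
n_matches <- sum(has_name)
```

Without fixed = TRUE, the "." in the third name would act as a regex wildcard and match "oxneil", which is probably not what is wanted here.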