Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Mapping columns from two dataframes in R

I have two dataframes with several columns, but for simplicity sake lets say my two dataframes are as such:

## Dataframe #1
df1 <- data.frame(name = c("Jake", "Paul", "Luis", "Leon"),
                 salary_new = c(60, 80, 90, 50))

## Dataframe #2
df2 <- data.frame(name = c("Paul Henderson", "John F. Smith", "Leon K.", "Luis Sierra"),
                         salary_old = c(60, 55, 60, 80))

I would like to do a sort of mapping between the two name columns where I would replace the names in DF1 with those in DF2 to produce the following results:

df3 <- data.frame(name = c("Jake", "Paul Henderson", "Luis Sierra", "Leon K."),
                  salary_new = c(60, 80, 90, 50))

What I am having trouble with is that each dataframe has names that don't belong to the other and the ones that do belong have a slight variation to them (include a last name, a middle initial). Is there a way to do a fuzzy matching of sorts to get the desired output and leave the names that aren't in DF2 the same as they appear in DF1?

like image 501
Student Avatar asked Nov 28 '25 19:11

Student


1 Answers

We could extract the first word and do a join

library(dplyr)
library(stringr)
df2 %>%
   mutate(name2 =word(name, 1)) %>%
   left_join(df1, ., by = c("name" = "name2")) %>%
   transmute(name = coalesce( name.y, name), salary_new)

-output

            name salary_new
1           Jake         60
2 Paul Henderson         80
3    Luis Sierra         90
4        Leon K.         50
like image 108
akrun Avatar answered Nov 30 '25 10:11

akrun



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!