Mapping columns from two dataframes in R

Question

I have two dataframes with several columns, but for simplicity sake lets say my two dataframes are as such:

## Dataframe #1
df1 <- data.frame(name = c("Jake", "Paul", "Luis", "Leon"),
                 salary_new = c(60, 80, 90, 50))

## Dataframe #2
df2 <- data.frame(name = c("Paul Henderson", "John F. Smith", "Leon K.", "Luis Sierra"),
                         salary_old = c(60, 55, 60, 80))

I would like to do a sort of mapping between the two name columns where I would replace the names in DF1 with those in DF2 to produce the following results:

df3 <- data.frame(name = c("Jake", "Paul Henderson", "Luis Sierra", "Leon K."),
                  salary_new = c(60, 80, 90, 50))

What I am having trouble with is that each dataframe has names that don't belong to the other and the ones that do belong have a slight variation to them (include a last name, a middle initial). Is there a way to do a fuzzy matching of sorts to get the desired output and leave the names that aren't in DF2 the same as they appear in DF1?

akrun · Accepted Answer

We could extract the first word and do a join

library(dplyr)
library(stringr)
df2 %>%
   mutate(name2 =word(name, 1)) %>%
   left_join(df1, ., by = c("name" = "name2")) %>%
   transmute(name = coalesce( name.y, name), salary_new)

-output

            name salary_new
1           Jake         60
2 Paul Henderson         80
3    Luis Sierra         90
4        Leon K.         50

Mapping columns from two dataframes in R

Tags:

string-matching

pattern-matching

r

Student

1 Answers

akrun

Recent Activity

Donate For Us

Mapping columns from two dataframes in R

Tags:

string-matching

pattern-matching

r

Student

1 Answers

akrun

Related questions

Recent Activity

Donate For Us