I have a data frame in R.
I'm trying to add (mutate) a new column that maps several old character strings to new ones, using a lookup ("Rosetta Stone") data frame that defines which strings should be replaced.
I was thinking something involving dplyr::mutate and some kind of function that applies gsub, but I just can't put it all together.
Starting Data Frame:
starting_df <- read.table(header=TRUE, text="
ID Genotype
VIT_123_1 0
ROM_456_2 0
VIT_78_1 1
BELG_910_1 1
")
Rosetta Stone Data Frame:
map_df <- read.table(header=TRUE, text="
ID New_ID
VIT VCO1
ROM VRO1
BELG VBE2
")
Desired Output Data Frame:
>head(updated_df)
ID Genotype New_ID
VIT_123_1 0 VCO1_123_1
ROM_456_2 0 VRO1_456_2
VIT_78_1 1 VCO1_78_1
BELG_910_1 1 VBE2_910_1
You can use str_replace_all from the stringr package.
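First, convert your map_df data frame into a named character vector, where the names are the old strings and the values are their replacements:
# Values are the replacements; names are the strings to look for
map_v <- as.character(map_df$New_ID)
names(map_v) <- as.character(map_df$ID)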
Then replace the old values with new values:
library(stringr)
res <- starting_df
# Each name in map_v is treated as a regex pattern and replaced by its value
res$New_ID <- str_replace_all(starting_df$ID, map_v)
res
ID Genotype New_ID
1 VIT_123_1 0 VCO1_123_1
2 ROM_456_2 0 VRO1_456_2
3 VIT_78_1 1 VCO1_78_1
4 BELG_910_1 1 VBE2_910_1
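Note that str_replace_all treats the names of map_v as regular expressions and will replace them anywhere in the string, not only at the start. If you want to be sure that only the prefix before the first underscore is translated, here is a minimal base R sketch of a different technique (an exact lookup with match instead of pattern replacement). It assumes every ID has the form prefix_rest; the helper names prefix, rest and new_prefix are mine, not from the question, and IDs whose prefix is missing from map_df are left unchanged.

```r
# Recreate the example data; stringsAsFactors = FALSE keeps IDs as plain strings
starting_df <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
ID Genotype
VIT_123_1 0
ROM_456_2 0
VIT_78_1 1
BELG_910_1 1
")
map_df <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
ID New_ID
VIT VCO1
ROM VRO1
BELG VBE2
")

# Split each ID at the first underscore (assumption: IDs look like prefix_rest)
prefix <- sub("_.*$", "", starting_df$ID)   # text before the first underscore
rest   <- sub("^[^_]*", "", starting_df$ID) # first underscore and everything after

# Exact lookup of each prefix in the map; NA where the prefix is not listed
new_prefix <- map_df$New_ID[match(prefix, map_df$ID)]

# Build the new column, keeping the original ID when no mapping exists
updated_df <- starting_df
updated_df$New_ID <- ifelse(is.na(new_prefix),
                            starting_df$ID,
                            paste0(new_prefix, rest))
print(updated_df)
```

For the example data this should give the same New_ID column as the stringr approach. The difference only shows up when an old string also occurs somewhere other than the prefix, or when one old string is a substring of another.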