Here's a trivial example of what I'm trying to do:
iris %>%
mutate(Species2 = ifelse(Species %in% c("setosa", "virginica"), "other", as.character(Species)) %>% as.factor) %>%
str
# 'data.frame': 150 obs. of 6 variables:
# $ Sepal.Length: num 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
# $ Sepal.Width : num 3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
# $ Petal.Length: num 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
# $ Petal.Width : num 0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
# $ Species : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...
# $ Species2 : Factor w/ 2 levels "other","versicolor": 1 1 1 1 1 1 1 1 1 1 ...
However, if I want to merge several groups of levels, I'd end up with deeply nested ifelse statements, which I'm trying to avoid. What's the most elegant way to do this? Ideally the solution would fit into a dplyr pipeline.
You can use match:
species.keep <- c("setosa", "virginica", "other")
iris %>% mutate(Species2 = species.keep[match(Species, species.keep, nomatch=3)])
We use the nomatch argument of match to send any species not found in species.keep to position 3, the last element of the vector, which is "other". Here setosa and virginica are kept as they are and every other species becomes "other". Note this assumes "other" is not a valid species. You'll still need to add as.factor and so on, but this should get you to what you want. match is the basic lookup function in base R.
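As a rough sketch of how the match idea could extend to several merges at once, the lookup can be wrapped in a small helper. The function name recode_by_match, its arguments, and the label "group_a" are made up for illustration and are not part of base R or dplyr; only match, factor, and table are real base R functions.

```r
# Hypothetical helper: relabel values by looking them up with match().
# `from` and `to` are parallel vectors (from[i] is relabelled as to[i]);
# anything not found in `from` falls back to `default`.
recode_by_match <- function(x, from, to, default = "other") {
  targets <- c(to, default)                          # default sits in the last position
  idx <- match(x, from, nomatch = length(targets))   # unmatched values point at the default
  factor(targets[idx])
}

# Merge setosa and virginica into one (made-up) group; the rest become "other".
species2 <- recode_by_match(
  iris$Species,
  from = c("setosa", "virginica"),
  to   = c("group_a", "group_a")
)
table(species2)
```

Because the helper takes a vector and returns a factor of the same length, it should also work inside a dplyr pipeline, for example mutate(Species2 = recode_by_match(Species, from, to)) with from and to defined as above.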