I have a dataframe column with numerous textual values (levels). I need to map those values to a predefined object-like structure in order to reduce the number of levels. I could easily achieve this in Python using a dictionary but could not do the same with a list in R.
For example, my dataframe column is something like:
df <- data.frame(weather = c('Clear','Snow','Clear','Rain','Rain','Other','Hail/sleet','Unknown'))
I need to map this to a list like
weather.levels <- list(
  dry = c('Clear', 'Cloudy'),
  wet = c('Snow', 'Rain', 'Hail/sleet'),
  other = c('Other', 'Unknown'))
so that my transformed dataframe looks like
old.weather new.weather
1 Clear dry
2 Snow wet
3 Clear dry
4 Rain wet
5 Rain wet
6 Other other
7 Hail/sleet wet
8 Unknown other
I have looked at solutions like this and this, but they do not answer my question. I cannot build a dataframe to use with R's match function, because the categories of the preset dictionary weather.levels ('dry', 'wet', 'other') each contain a different number of levels.
As is often the case, base R has a function designed to do exactly this. levels<- is what you want.
Note that df$weather needs to be a factor for this to work properly. Before R 4.0 the code below worked without an explicit conversion, because data.frame() created string columns as factors by default:
df$new.weather <- `levels<-`(df$weather, weather.levels)
## if variable not already a factor, instead:
df$new.weather <- `levels<-`(factor(df$weather), weather.levels)
df
# weather new.weather
#1 Clear dry
#2 Snow wet
#3 Clear dry
#4 Rain wet
#5 Rain wet
#6 Other other
#7 Hail/sleet wet
#8 Unknown other
In a slightly longer but easier-to-read form, this is equivalent to:
## convert to a factor first, in case the column is still character
df$new.weather <- factor(df$weather)
levels(df$new.weather) <- weather.levels
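The question rules out match because the groups in weather.levels have different sizes. As an alternative sketch (not part of the answer above), base R's stack() can flatten a named list with unequal groups into a two-column lookup table, after which match works as usual. The name lookup is introduced here purely for illustration.

```r
df <- data.frame(weather = c('Clear', 'Snow', 'Clear', 'Rain', 'Rain',
                             'Other', 'Hail/sleet', 'Unknown'))
weather.levels <- list(
  dry = c('Clear', 'Cloudy'),
  wet = c('Snow', 'Rain', 'Hail/sleet'),
  other = c('Other', 'Unknown'))

# stack() turns the named list into a data frame with two columns:
# 'values' (the old labels) and 'ind' (the group name, as a factor).
lookup <- stack(weather.levels)  # 'lookup' is a hypothetical helper name

# match() gives the row of each old label in the lookup table;
# indexing 'ind' with those rows yields the new group for every row of df.
df$new.weather <- as.character(lookup$ind[match(df$weather, lookup$values)])
df
```

This variant works on a plain character column and returns a character column. Any value that does not appear in weather.levels comes back as NA.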