
R: Transform dataframe column using dictionary/list?

I have a dataframe column with numerous textual values (levels). I need to map those values to a predefined object-like structure in order to reduce the number of levels. I could easily achieve this in Python using a dictionary but could not do the same with a list in R.

For example, my dataframe column is something like:

df <- data.frame(weather = c('Clear','Snow','Clear','Rain','Rain','Other','Hail/sleet','Unknown'))

I need to map this to a list like

weather.levels <- list(
  dry = c('Clear', 'Cloudy'),
  wet = c('Snow', 'Rain', 'Hail/sleet'),
  other = c('Other','Unknown'))

so that my transformed dataframe looks like

    old.weather new.weather
1       Clear         dry
2        Snow         wet
3       Clear         dry
4        Rain         wet
5        Rain         wet
6       Other       other
7  Hail/sleet         wet
8     Unknown       other

I have looked at other solutions, but they do not answer my question. I cannot create a dataframe to use R's match function because the number of levels in each category of the preset dictionary weather.levels ('dry', 'wet', 'other') is different.

emphasent asked Dec 18 '25 08:12


1 Answer

As is often the case, there is a base R function designed to do exactly this: levels<-.

Note that df$weather needs to be a factor for this to work properly. (Before R 4.0, the code below worked without an explicit conversion, because data.frame() turned character columns into factors by default.)

## if df$weather is already a factor:
df$new.weather <- `levels<-`(df$weather, weather.levels)
## if it is not (e.g. a character column in R >= 4.0), convert it first:
df$new.weather <- `levels<-`(factor(df$weather), weather.levels)
df
#     weather new.weather
#1      Clear         dry
#2       Snow         wet
#3      Clear         dry
#4       Rain         wet
#5       Rain         wet
#6      Other       other
#7 Hail/sleet         wet
#8    Unknown       other

In a slightly longer but easier-to-read form, this is equivalent to:

df$new.weather <- factor(df$weather)  # harmless if it is already a factor
levels(df$new.weather) <- weather.levels
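
For what it's worth, the ragged list does not actually rule out match(). Here is a minimal sketch, assuming the same df and weather.levels as in the question: stack() flattens the list into a two-column lookup table (values and ind are the column names stack() produces), and match() then does the dictionary-style lookup. The name lookup is only an illustrative choice, not something from the original post.

```r
df <- data.frame(weather = c('Clear','Snow','Clear','Rain','Rain','Other','Hail/sleet','Unknown'))

weather.levels <- list(
  dry = c('Clear', 'Cloudy'),
  wet = c('Snow', 'Rain', 'Hail/sleet'),
  other = c('Other','Unknown'))

# stack() turns the ragged list into a data frame with one row per label:
# `values` holds the original labels, `ind` the group each one belongs to.
lookup <- stack(weather.levels)

# match() finds the lookup row for each weather value; a value that is
# not listed in weather.levels would come back as NA.
df$new.weather <- lookup$ind[match(df$weather, lookup$values)]
df
```

This version does not care whether df$weather is a character vector or a factor, since match() compares factors by their labels.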

thelatemail answered Dec 19 '25 21:12


