Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Recoding an entire data frame using label-value pairs stored in a different dataframe

Suppose I have a dataframe:

df <- data.frame(
  x = c("Yes","No","Yes","Don't Know"),
  y = c("Male","Female","Refused","Male"),
  z = c("Employed", "Unemployed","Employed","Employed")
)

I want to recode the values in the entire dataframe to numeric values and have the codes stored in a separate dataframe:

codes <- data.frame(
  variable = c(rep("x",3),rep("y",3),rep("z",2)),
  labels = c("Yes","No","Don't Know","Male","Female","Refused","Employed", "Unemployed"),
  codes = c(1,2,98,1,2,99,1,2)
)

I want to output to look like:

> df
   x  y z
1  1  1 1
2  2  2 2
3  1 99 1
4 98  1 1

How can I approach this problem? A tidyverse solution would be very helpful.

like image 464
Ahmad Noman Alnoor Avatar asked Oct 25 '25 18:10

Ahmad Noman Alnoor


2 Answers

An option would be to bring the data in long format join with codes dataset and move it back to wide format.

Using tidyverse this can be done as following :

library(dplyr)
library(tidyr)

df %>%
  pivot_longer(cols = everything()) %>%
  left_join(codes, join_by(name == variable, value == labels)) %>%
  select(-value) %>%
  pivot_wider(names_from = name, values_from = codes, values_fn = list) %>%
  unnest_longer(everything())

# A tibble: 4 × 3
#      x     y     z
#  <dbl> <dbl> <dbl>
#1     1     1     1
#2     2     2     2
#3     1    99     1
#4    98     1     1
like image 64
Ronak Shah Avatar answered Oct 28 '25 08:10

Ronak Shah


In tidyverse, you can try

df %>%
    mutate(across(everything(), ~{with(
        filter(codes, variable == cur_column()),
        codes[match(.x, labels)]) }))

which gives

   x  y z
1  1  1 1
2  2  2 2
3  1 99 1
4 98  1 1
like image 33
ThomasIsCoding Avatar answered Oct 28 '25 08:10

ThomasIsCoding



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!