Suppose I have a dataframe:
df <- data.frame(
x = c("Yes","No","Yes","Don't Know"),
y = c("Male","Female","Refused","Male"),
z = c("Employed", "Unemployed","Employed","Employed")
)
I want to recode the values in the entire dataframe to numeric values and have the codes stored in a separate dataframe:
codes <- data.frame(
variable = c(rep("x",3),rep("y",3),rep("z",2)),
labels = c("Yes","No","Don't Know","Male","Female","Refused","Employed", "Unemployed"),
codes = c(1,2,98,1,2,99,1,2)
)
I want to output to look like:
> df
x y z
1 1 1 1
2 2 2 2
3 1 99 1
4 98 1 1
How can I approach this problem? A tidyverse solution would be very helpful.
An option would be to bring the data in long format join with codes dataset and move it back to wide format.
Using tidyverse this can be done as following :
library(dplyr)
library(tidyr)
df %>%
pivot_longer(cols = everything()) %>%
left_join(codes, join_by(name == variable, value == labels)) %>%
select(-value) %>%
pivot_wider(names_from = name, values_from = codes, values_fn = list) %>%
unnest_longer(everything())
# A tibble: 4 × 3
# x y z
# <dbl> <dbl> <dbl>
#1 1 1 1
#2 2 2 2
#3 1 99 1
#4 98 1 1
In tidyverse, you can try
df %>%
mutate(across(everything(), ~{with(
filter(codes, variable == cur_column()),
codes[match(.x, labels)]) }))
which gives
x y z
1 1 1 1
2 2 2 2
3 1 99 1
4 98 1 1
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With