
Merging multiple columns and filling in NA answers

I have a dataset with over 14000 observations and 43 variables. The data was collected across 11 countries and for two of the questions, participants were asked different variations of the same question based on the country they were in, meaning that for 2 variables I actually have 22 columns. Basically, here is an example of what the df looks like:

df <- data.frame(
  country = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
  Q1_UK = c(1, 2, 2, NA, NA, NA, NA, NA, NA),
  Q1_FR = c(NA, NA, NA, 2, 1, 2, NA, NA, NA),
  Q1_ES = c(NA, NA, NA, NA, NA, NA, 2, 2, 1),
  Q2_UK = c(1, 1, 2, NA, NA, NA, NA, NA, NA),
  Q2_FR = c(NA, NA, NA, 1, 2, 2, NA, NA, NA),
  Q2_ES = c(NA, NA, NA, NA, NA, NA, 1, 2, 1)
)


   country  Q1_UK Q1_FR Q1_ES Q2_UK Q2_FR Q2_ES
1       1       1    NA    NA     1    NA    NA
2       1       2    NA    NA     1    NA    NA
3       1       2    NA    NA     2    NA    NA
4       2      NA     2    NA    NA     1    NA
5       2      NA     1    NA    NA     2    NA
6       2      NA     2    NA    NA     2    NA
7       3      NA    NA     2    NA    NA     1
8       3      NA    NA     2    NA    NA     2
9       3      NA    NA     1    NA    NA     1

and so on...

I want to have 2 single variables containing all responses for different countries - with an end result like this:

  country Q1 Q2
1       1  1  1
2       1  2  1
3       1  2  2  
4       2  2  1 
5       2  1  2 
6       2  2  2 
7       3  2  1 
8       3  2  2
9       3  1  1

I was thinking that rotating the dataframe, using fill(), and then rotating it back might work, but I am not sure how to go about it or how to make sure that answers are only filled in within each question and not across variables. I am really new to R and I am exhausted, so I might just be missing something obvious.

ouroboro asked Nov 27 '25 19:11



1 Answer

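This can be done with pivot_longer() from tidyr. Using ".value" in names_to makes the part of each column name captured by names_pattern (everything before the last underscore, i.e. Q1 or Q2) the name of an output column; values_drop_na = TRUE then drops the rows that are entirely NA because they belong to another country's variant: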

library(tidyr)
pivot_longer(df, cols = -country, names_to = c(".value"),
    names_pattern = "(.*)_.*", values_drop_na = TRUE)

-output

# A tibble: 9 × 3
  country    Q1    Q2
    <int> <int> <int>
1       1     1     1
2       1     2     1
3       1     2     2
4       2     2     1
5       2     1     2
6       2     2     2
7       3     2     1
8       3     2     2
9       3     1     1
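
For comparison, here is a minimal base R sketch of the same idea that needs no packages: for each question prefix, take the first non-missing value across that question's country columns. The helper name first_non_na and the object merged are made up for this illustration, and the sketch assumes each row has at most one non-NA value per question, as in the example data.

```r
# Example data from the question
df <- data.frame(
  country = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
  Q1_UK = c(1, 2, 2, NA, NA, NA, NA, NA, NA),
  Q1_FR = c(NA, NA, NA, 2, 1, 2, NA, NA, NA),
  Q1_ES = c(NA, NA, NA, NA, NA, NA, 2, 2, 1),
  Q2_UK = c(1, 1, 2, NA, NA, NA, NA, NA, NA),
  Q2_FR = c(NA, NA, NA, 1, 2, 2, NA, NA, NA),
  Q2_ES = c(NA, NA, NA, NA, NA, NA, 1, 2, 1)
)

# Question prefixes: drop the last "_XX" country suffix from each column name
question_cols <- setdiff(names(df), "country")
prefixes <- unique(sub("_[^_]*$", "", question_cols))

# Hypothetical helper: element-wise first non-NA value across a list of columns
first_non_na <- function(cols) {
  Reduce(function(a, b) ifelse(is.na(a), b, a), cols)
}

# Build one merged column per question prefix
merged <- data.frame(country = df$country)
for (p in prefixes) {
  # Match "Q1_" rather than "Q1" so that a prefix like Q1 does not pick up Q10_*
  cols <- df[startsWith(names(df), paste0(p, "_"))]
  merged[[p]] <- first_non_na(cols)
}

merged
```

Unlike the pivot_longer() call with values_drop_na = TRUE, this keeps every original row, so a participant who answered none of the variants would stay in the result with NA in that question's column.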
akrun answered Nov 30 '25 09:11



