pivotting a list of data frames and merging them

Question

I have a list of 3 dataframes that share some rows and columns

DATA

all_data <- list(questionnaireA = structure(list(name = structure(2:1, .Label = c("James", 
          "Shawn"), class = "factor"), banana = c(1, 0), grapes = c(1, 
          1), orange = c("AB", 1)), class = "data.frame", row.names = c(NA, 
          -2L)), questionnaireB = structure(list(name = structure(2:1, .Label = c("Chris", 
          "James"), class = "factor"), orange = c(1, 0), banana = c(1, 
          0)), class = "data.frame", row.names = c(NA, -2L)), questionnaireC = structure(list(
          name = structure(3:1, .Label = c("Donald", "James", "Shawn"
          ), class = "factor"), banana = c(1, 0, 0), raisins = c(1, 
          1, 1), grapes = c(1, 1, 0), cake = c(0, 1, 0)), class = "data.frame", row.names = c(NA, -3L)))

$questionnaireA
   name banana grapes orange
1 Shawn      1      1     AB
2 James      0      1      1

$questionnaireB
   name orange banana
1 James      1      1
2 Chris      0      0

$questionnaireC
    name banana raisins grapes cake
1  Shawn      1       1      1    0
2  James      0       1      1    1
3 Donald      0       1      0    0

library(tidyverse)
map(all_data, ~ .x %>%
    pivot_longer(cols=-name, names_to="fruit"))

Am not sure how to rename the value to the name of the dataframe.
I don't know how to join the values and union the name fruit pair.

Any help would be appreciated!

akrun · Accepted Answer

If we follow the similar option as in OP's attempted method, i.e. reshape to 'long' format each of the datasets in the list, then loop with imap, create the name of the list1 element as new column, use pivot_longer to reshape into 'long' format, then create a sequence column by group and reshape into 'wide' format with pivot_wider

library(dplyr)
library(tidyr)
library(purrr)
imap_dfr(all_data, ~
                   .x %>% 
                      mutate(grp = .y) %>%
                      pivot_longer(cols = -c(name, grp), 
                         names_to = "fruit", values_to = "Value")) %>% 
   #group_by(name, grp, fruit) %>% 
   #mutate(rn = row_number()) %>%
   pivot_wider(names_from = grp, values_from = Value)

Or do this more efficiently, by binding all the datasets to a single data with bind_cols, do the pivot_longer while removing the missing values with value_drop_na = TRUE and then do the same as in the above solution

bind_rows(all_data, .id = 'grp') %>%
    pivot_longer(cols = c(-name, -grp), names_to = "fruit", 
         values_to = "Value", values_drop_na = TRUE) %>%
    # sequence column creation is not really required for the example
    # as there are no duplicates
    #group_by(name, grp, fruit) %>% 
    #mutate(rn = row_number()) %>%
    pivot_wider(names_from = grp, values_from = Value)

Update

Based on the new data with some mix in column types, if we need to have the values like "AB" remain as such, it needs to be converted to character class

imap_dfr(all_data, ~
               .x %>% 
                  mutate_at(-1, as.character) %>% 
                  mutate(grp = .y) %>% 
                  pivot_longer(cols = -c(name, grp), names_to = "fruit",
                    values_to = "Value")) %>% 
       pivot_wider(names_from = grp, values_from = Value)

Or an efficient approach similar to the bind_rows earlier (but it can't be done here because the column types are different)

map_dfr(all_data, ~
         .x %>%
            mutate_at(-1, as.character), .id = 'grp') %>%  
    pivot_longer(cols = c(-name, -grp), names_to = "fruit", 
          values_to = "Value", values_drop_na = TRUE)  %>%
    pivot_wider(names_from = grp, values_from = Value)

pivotting a list of data frames and merging them

Tags:

list

r

pivot

tidyverse

Shahin

1 Answers

Update

akrun

Recent Activity

Donate For Us

pivotting a list of data frames and merging them

Tags:

list

r

pivot

tidyverse

Shahin

1 Answers

Update

akrun

Related questions

Recent Activity

Donate For Us