Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Convert a columns into separate columns in wide format in R

Tags:

r

tidyr

tidyverse

I am trying to convert the "FRUITS" column into separate columns ("Apple" and "Banana") in wide format.

  Gender  AgeGroup              EAT           FRUITS 
1 Female 30yr_39yr              Yes             Apple           
2 Female 20yr_29yr              Yes             Apple              
3 Female 70yr_80yr              Yes             Apple             
4   Male 50yr_59yr              Yes             Banana              
5 Female 40yr_49yr              Yes             Apple                  
6 Female 70yr_80yr              Yes             Apple              

how to convert FRUITS column into:

  Gender AgeGroup       EAT  Apple      Banana 
1 Female 30yr_39yr      Yes  TRUE      FALSE
2 Female 20yr_29yr      Yes  TRUE      FALSE
3 Female 70yr_80yr      Yes  TRUE      FALSE
4   Male 50yr_59yr      Yes  FALSE     TRUE
5 Female 40yr_49yr      Yes  TRUE      FALSE
6 Female 70yr_80yr      Yes  TRUE      FALSE

Here is the dataframe I used:

data.frame(
  Gender = c("Female", "Female", "Female", "Male", "Female", "Female"),
  AgeGroup = c("30yr_39yr", "20yr_29yr", "70yr_80yr", "50yr_59yr", "40yr_49yr", "70yr_80yr"),
  EAT = c("Yes", "Yes", "Yes", "Yes", "Yes", "Yes"),
  FRUITS = c("Apple", "Apple", "Apple", "Banana", "Apple", "Apple")
)
like image 413
01200 Avatar asked Oct 24 '25 17:10

01200


2 Answers

df[c("Apple", "Banana")] <- list(df$FRUITS == "Apple", df$FRUITS == "Banana")

#   Gender  AgeGroup EAT FRUITS Apple Banana
# 1 Female 30yr_39yr Yes  Apple  TRUE  FALSE
# 2 Female 20yr_29yr Yes  Apple  TRUE  FALSE
# 3 Female 70yr_80yr Yes  Apple  TRUE  FALSE
# 4   Male 50yr_59yr Yes Banana FALSE   TRUE
# 5 Female 40yr_49yr Yes  Apple  TRUE  FALSE
# 6 Female 70yr_80yr Yes  Apple  TRUE  FALSE

To generalise to more value you could use:

cols <- c("Apple", "Banana")
df[cols] <- lapply(cols, \(x) df$FRUITS == x)
like image 73
sindri_baldur Avatar answered Oct 26 '25 09:10

sindri_baldur


To reshape to wide using tidyr::pivot_wider you have to add a value column to your dataset and a column with a unique id for each row:

df <- data.frame(
  Gender = c("Female", "Female", "Female", "Male", "Female", "Female"),
  AgeGroup = c("30yr_39yr", "20yr_29yr", "70yr_80yr", "50yr_59yr", "40yr_49yr", "70yr_80yr"),
  EAT = c("Yes", "Yes", "Yes", "Yes", "Yes", "Yes"),
  FRUITS = c("Apple", "Apple", "Apple", "Banana", "Apple", "Apple")
)

library(tidyr)
library(dplyr, warn = FALSE)

df |>
  mutate(
    value = TRUE,
    id = row_number()
  ) |>
  pivot_wider(
    names_from = FRUITS,
    values_from = value, values_fill = FALSE
  ) |>
  select(-id)
#> # A tibble: 6 × 5
#>   Gender AgeGroup  EAT   Apple Banana
#>   <chr>  <chr>     <chr> <lgl> <lgl> 
#> 1 Female 30yr_39yr Yes   TRUE  FALSE 
#> 2 Female 20yr_29yr Yes   TRUE  FALSE 
#> 3 Female 70yr_80yr Yes   TRUE  FALSE 
#> 4 Male   50yr_59yr Yes   FALSE TRUE  
#> 5 Female 40yr_49yr Yes   TRUE  FALSE 
#> 6 Female 70yr_80yr Yes   TRUE  FALSE
like image 40
stefan Avatar answered Oct 26 '25 07:10

stefan