
Split one variable into multiple variables in R

I am relatively new to R. My question isn't entirely as straightforward as the title. This is a sample of what df looks like:

id    amenities
1     wireless internet, air conditioning, pool, kitchen
2     pool, kitchen, washer, dryer
3     wireless internet, kitchen, dryer
4     
5     wireless internet

This is what I want df to look like:

id    wireless internet   air conditioning   pool   kitchen   washer   dryer
1     1                   1                  1      1         0        0
2     0                   0                  1      1         1        1
3     1                   0                  0      1         0        1
4     0                   0                  0      0         0        0
5     1                   0                  0      0         0        0

Sample code to reproduce the data:

df <- data.frame(id = c(1, 2, 3, 4, 5),
      amenities = c("wireless internet, air conditioning, pool, kitchen",  
                    "pool, kitchen, washer, dryer", 
                    "wireless internet, kitchen, dryer", 
                    "", 
                    "wireless internet"), 
      stringsAsFactors = FALSE)
sweetmusicality asked Jan 18 '26 05:01


2 Answers

FWIW, here's a base R approach (assuming that df contains your data as shown in the question):

dat <- with(df, strsplit(amenities, ', '))
df2 <- data.frame(id = factor(rep(df$id, times = lengths(dat)),
                              levels = df$id),
                  amenities = unlist(dat))
df3 <- as.data.frame(cbind(id = df$id,
                     table(df2$id, df2$amenities)))

This results in

> df3
  id air conditioning dryer kitchen pool washer wireless internet
1  1                1     0       1    1      0                 1
2  2                0     1       1    1      1                 0
3  3                0     1       1    0      0                 1
4  4                0     0       0    0      0                 0
5  5                0     0       0    0      0                 1
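
The columns of df3 come out in alphabetical order because table() sorts its categories. If the column order from the question matters, one possible sketch is to reorder by first appearance using unique(unlist(dat)). This is an assumption about what is wanted rather than part of the answer above, and df3_ordered is a name introduced here purely for illustration:

```r
# Rebuild the data and the answer's objects so the snippet runs on its own
df <- data.frame(id = c(1, 2, 3, 4, 5),
                 amenities = c("wireless internet, air conditioning, pool, kitchen",
                               "pool, kitchen, washer, dryer",
                               "wireless internet, kitchen, dryer",
                               "",
                               "wireless internet"),
                 stringsAsFactors = FALSE)

dat <- with(df, strsplit(amenities, ', '))
df2 <- data.frame(id = factor(rep(df$id, times = lengths(dat)),
                              levels = df$id),
                  amenities = unlist(dat))
df3 <- as.data.frame(cbind(id = df$id,
                           table(df2$id, df2$amenities)))

# Amenity names in order of first appearance (hypothetical helper variable)
amenity_order <- unique(unlist(dat))

# Select columns by name; names containing spaces are fine inside [ ]
df3_ordered <- df3[, c("id", amenity_order)]
```

Indexing by name leaves the counts untouched; only the column positions change.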

Breaking down what is going on:

  1. dat <- with(df, strsplit(amenities, ', ')) splits the amenities variable on ', ', resulting in

    > dat
    [[1]]
    [1] "wireless internet" "air conditioning"  "pool"             
    [4] "kitchen"          
    
    [[2]]
    [1] "pool"    "kitchen" "washer"  "dryer"  
    
    [[3]]
    [1] "wireless internet" "kitchen"           "dryer"            
    
    [[4]]
    character(0)
    
    [[5]]
    [1] "wireless internet"
    
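  2. The second line takes dat and turns it into a vector, and we add on an id column by repeating the original id values as many times as the number of amenities for that id. id 4 has no amenities, so it gets no rows here, but it is kept as a factor level through levels = df$id. This results in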

    > df2
       id         amenities
    1   1 wireless internet
    2   1  air conditioning
    3   1              pool
    4   1           kitchen
    5   2              pool
    6   2           kitchen
    7   2            washer
    8   2             dryer
    9   3 wireless internet
    10  3           kitchen
    11  3             dryer
    12  5 wireless internet
    
  3. The third line uses the table() function to create the contingency table of id by amenity, and cbind() adds on an id column.
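
The reason id 4 still shows up as a row of zeros in df3, even though it has no rows in df2, is the levels argument passed to factor() in the second line. A minimal sketch of that behaviour, using a made-up ids vector rather than the real data:

```r
# Ids as they might look after splitting: id 4 has no amenities,
# so it never appears in the long-format data.
ids <- c(1, 1, 2, 3, 5)

# Plain vector: table() only counts values that actually occur.
counts_plain <- table(ids)
names(counts_plain)

# Factor with explicit levels: table() reports every level,
# so level 4 is included with a count of zero.
ids_f <- factor(ids, levels = 1:5)
counts_factor <- table(ids_f)
names(counts_factor)
```

Without the explicit levels, the contingency table would have only four rows and cbind(id = df$id, ...) would no longer line up with the five ids.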

Gavin Simpson answered Jan 19 '26 20:01


A solution using dplyr and tidyr. Notice that I replace "" with None because it is easier to process the column names later.

library(dplyr)
library(tidyr)

df2 <- df %>%
  # split on ", " (comma plus space); splitting on "," alone leaves
  # leading spaces, so " pool" and "pool" would become separate columns
  separate_rows(amenities, sep = ", ") %>%
  mutate(amenities = ifelse(amenities %in% "", "None", amenities)) %>%
  mutate(value = 1) %>%
  spread(amenities, value , fill = 0) %>%
  select(-None)
df2
#   id air conditioning dryer kitchen pool washer wireless internet
# 1  1                1     0       1    1      0                 1
# 2  2                0     1       1    1      1                 0
# 3  3                0     1       1    0      0                 1
# 4  4                0     0       0    0      0                 0
# 5  5                0     0       0    0      0                 1
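
In current tidyr releases spread() is marked as superseded in favour of pivot_wider(). The following is a rough sketch of the same idea with pivot_wider(), not the answer author's code. It assumes tidyr 1.1.0 or later (for a scalar values_fill), and the name wide is introduced here for illustration only:

```r
library(dplyr)
library(tidyr)

df <- data.frame(id = c(1, 2, 3, 4, 5),
                 amenities = c("wireless internet, air conditioning, pool, kitchen",
                               "pool, kitchen, washer, dryer",
                               "wireless internet, kitchen, dryer",
                               "",
                               "wireless internet"),
                 stringsAsFactors = FALSE)

wide <- df %>%
  # sep is a regular expression: a comma followed by optional whitespace
  separate_rows(amenities, sep = ",\\s*") %>%
  # give the empty entry a placeholder name so it can be dropped later
  mutate(amenities = ifelse(is.na(amenities) | amenities == "",
                            "None", amenities),
         value = 1) %>%
  pivot_wider(names_from = amenities, values_from = value,
              values_fill = 0) %>%
  select(-None)
```

By default pivot_wider() creates the new columns in order of first appearance rather than alphabetically, which happens to match the layout asked for in the question.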
www answered Jan 19 '26 18:01


