How to group rows into a single row for each unique column value?

Question

Suppose I have the following data frame:

df <- data.frame(a=c('A1', 'A1', 'A1', 'A2', 'A2', 'A3'), 
                 b=c('a', 'b', 'c', 'd', 'e', 'f'), c=rep(1, 6))

I am trying to group by the unique values in column 'a' so that the values of 'b' are collected into one row per group. The expected data frame would look like this:

# a    b        c   
# A1   [a,b,c]  1
# A2   [d,e]    1
# A3   [f]      1

How could I accomplish this task?

akrun · Accepted Answer

We could group by 'a' and 'c', then collapse the unique elements of 'b' into a single string:

library(dplyr)
df %>% 
   group_by(a, c) %>% 
   summarise(b = sprintf('[%s]', toString(unique(b))), .groups = 'drop') %>%
   select(names(df))

-output

# A tibble: 3 x 3
#  a     b             c
#  <chr> <chr>     <dbl>
#1 A1    [a, b, c]     1
#2 A2    [d, e]        1
#3 A3    [f]           1

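Or, if the 'c' values also vary within a group, group by 'a' only and use across to collapse every remaining column (note that each summarised column, including numeric ones, becomes a character string):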

df %>%
  group_by(a) %>%
  summarise(across(everything(), ~ sprintf('[%s]', 
       toString(unique(.)))), .groups = 'drop')
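
To see what the across variant does when 'c' is not constant within a group, here is a small sketch. The data frame df2 is a made-up variation of the question's data (an assumption, not part of the original post):

```r
library(dplyr)

# Hypothetical variation of the question's data: 'c' now differs within groups
df2 <- data.frame(a = c('A1', 'A1', 'A1', 'A2', 'A2', 'A3'),
                  b = c('a', 'b', 'c', 'd', 'e', 'f'),
                  c = c(1, 2, 3, 4, 4, 5),
                  stringsAsFactors = FALSE)

# Collapse the unique values of every non-grouping column into one string
res_across <- df2 %>%
  group_by(a) %>%
  summarise(across(everything(), ~ sprintf('[%s]', toString(unique(.x)))),
            .groups = 'drop')

res_across$c
# "[1, 2, 3]" "[4]" "[5]"
```

Duplicated values inside a group (the two 4s for A2) appear only once because of unique().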

Or, if we need a list column instead of a string:

df %>%
  group_by(a) %>%
  summarise(across(everything(), ~ list(unique(.))), .groups = 'drop')
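
A list column keeps the grouped values as real vectors rather than text, so they can still be indexed or counted. The following is a minimal sketch of how the result might be inspected; the name res_list is only illustrative:

```r
library(dplyr)

df <- data.frame(a = c('A1', 'A1', 'A1', 'A2', 'A2', 'A3'),
                 b = c('a', 'b', 'c', 'd', 'e', 'f'),
                 c = rep(1, 6),
                 stringsAsFactors = FALSE)

# Each cell of 'b' and 'c' now holds a vector of the group's unique values
res_list <- df %>%
  group_by(a) %>%
  summarise(across(everything(), ~ list(unique(.x))), .groups = 'drop')

res_list$b[[1]]      # unique 'b' values for group A1
lengths(res_list$b)  # number of unique 'b' values per group
```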

Or using glue to build the string:

df %>%
   group_by(a, c) %>%
   summarise(b = glue::glue('[{toString(unique(b))}]'), .groups = 'drop')

-output

# A tibble: 3 x 3
#  a         c b        
#* <chr> <dbl> <glue>   
#1 A1        1 [a, b, c]
#2 A2        1 [d, e]   
#3 A3        1 [f]
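
For comparison, the same string-collapsing idea can probably be expressed without dplyr. This is a sketch using base R's aggregate with its formula interface, not something from the answer above; res_base is just an illustrative name:

```r
df <- data.frame(a = c('A1', 'A1', 'A1', 'A2', 'A2', 'A3'),
                 b = c('a', 'b', 'c', 'd', 'e', 'f'),
                 c = rep(1, 6),
                 stringsAsFactors = FALSE)

# Group by 'a' and 'c', collapsing the unique 'b' values into one string
res_base <- aggregate(b ~ a + c, data = df,
                      FUN = function(x) sprintf('[%s]', toString(unique(x))))

# aggregate puts the grouping columns first; restore the original column order
res_base <- res_base[names(df)]
res_base
```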
