I'm trying to use dplyr to split each string into its characters and rejoin them as a comma-separated string, but I'm not having much luck.
dat <- data.frame(key = 1:4, labels = c('a', 'ab', 'abc', 'b'))
I'm trying to get the labels column to be c('a','a,b','a,b,c','b')
I've tried all of the below variations but nothing seems to work.
dat %>%
  mutate(labels = str_split(labels, ''))
dat %>%
  mutate(labels = str_split(labels, '')[[1]])
dat %>%
  mutate(labels = paste(str_split(labels, ''), collapse = ','))
Neither dplyr nor mutate has anything to do with your problem. The issue is that you are treating a list (which is what str_split returns) as if it were a vector.
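To see the list-versus-vector issue concretely, it can help to inspect what the split returns. This is a minimal sketch using base R's strsplit, which (like str_split) returns a list with one element per input string; the variable names `pieces` and `joined` are just for illustration.

```r
labels <- c("a", "ab", "abc", "b")

# Splitting a character vector gives a list: one character vector per input string
pieces <- strsplit(labels, "")
str(pieces)

# paste() on the whole list converts each element to a single deparsed string,
# and collapse then glues all rows together, so only one string comes back
length(paste(pieces, collapse = ","))

# Applying paste() to each list element keeps one result per row
joined <- vapply(pieces, paste, character(1), collapse = ",")
joined
```

The last step is the one that matters: the joining has to happen per list element, not on the list as a whole.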
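I would write a little function to do it:
comma_sep = function(x) {
  # split each string into single characters (returns a list)
  x = strsplit(as.character(x), "")
  # re-join each list element with commas, then flatten to a vector
  unlist(lapply(x, paste, collapse = ','))
}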
You can then use it inside mutate:
mutate(dat, labels = comma_sep(labels))
#   key labels
# 1   1      a
# 2   2    a,b
# 3   3  a,b,c
# 4   4      b
But of course you could jam the meat of the function into that one line as well.
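For illustration, the inlined version might look like the sketch below. It assumes dplyr is installed and loaded for mutate; everything else is base R, and `res` is just a name chosen here.

```r
library(dplyr)

dat <- data.frame(key = 1:4, labels = c("a", "ab", "abc", "b"))

# Same logic as comma_sep(), inlined: split each label into characters,
# then paste each piece back together with commas
res <- mutate(dat, labels = unlist(lapply(strsplit(as.character(labels), ""), paste, collapse = ",")))
res
```

The named function is arguably easier to read and reuse, which is probably why it was suggested first.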
Replace each non-boundary position (\B, the empty position between two adjacent word characters) with a comma like this:
dat %>% mutate(labels = gsub("\\B", ",", labels, perl = TRUE))
Or, with a slightly more complex regex that does not need perl = TRUE, replace each character that is followed by a non-boundary with that character followed by a comma:
dat %>% mutate(labels = gsub("(.)\\B", "\\1,", labels))
Either one gives:
  key labels
1   1      a
2   2    a,b
3   3  a,b,c
4   4      b
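As a quick sanity check, both regex variants can be compared against a split-and-paste result on a plain character vector. This is only a sketch on the sample labels; the names `via_boundary`, `via_capture` and `via_split` are made up for the comparison.

```r
labels <- c("a", "ab", "abc", "b")

# For these labels, \B matches the empty position between two adjacent letters
via_boundary <- gsub("\\B", ",", labels, perl = TRUE)

# Capture a character that is followed by such a position, then re-emit it with a comma
via_capture <- gsub("(.)\\B", "\\1,", labels)

# Split-and-paste reference result for comparison
via_split <- vapply(strsplit(labels, ""), paste, character(1), collapse = ",")

# Stops with an error if the approaches disagree on the sample data
stopifnot(identical(via_boundary, via_split), identical(via_capture, via_split))
```

Because \B is defined in terms of word characters (letters, digits, underscore), labels containing spaces or punctuation would likely not get a comma between every pair of characters, so a split-based approach is probably the safer choice for arbitrary text.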