In a dataset, I have a column containing letters, as follows:
d = data.frame(col1 = c("ABC", "CDE","ACE","BDF"))
d
  col1
1  ABC
2  CDE
3  ACE
4  BDF
I would like to create one column for each distinct letter that appears in col1, with each new column holding TRUE/FALSE, as follows (only column A shown):
  col1     A
1  ABC  TRUE
2  CDE FALSE
3  ACE  TRUE
4  BDF FALSE
The problem is that I have 25 different letters. To identify each distinct character contained in the column, I already have the function I need:
find.characters <- function(v1) {
  x1 <- unique(unlist(strsplit(v1, '')))
  indx <- grepl('[A-Z]', x1)
  c(sort(x1[indx]), sort(x1[!indx]))
}
find.characters(d$col1)
[1] "A" "B" "C" "D" "E" "F"
But I am struggling to create the set of columns from this vector of characters.
You can take advantage of the built-in LETTERS character vector to create the column names and apply grepl to each of its elements (the \(l) lambda shorthand requires R 4.1 or later; on older versions write function(l) instead):
d[LETTERS] <- sapply(LETTERS, \(l) grepl(l, d$col1))
Output:
# col1 A B C D E F G H I J K L M N O P Q R
# 1 ABC TRUE TRUE TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
# 2 CDE FALSE FALSE TRUE TRUE TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
# 3 ACE TRUE FALSE TRUE FALSE TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
# 4 BDF FALSE TRUE FALSE TRUE FALSE TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
# S T U V W X Y Z
# 1 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
# 2 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
# 3 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
# 4 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
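The LETTERS approach above creates all 26 columns, including letters that never occur in col1. If only the characters actually present are wanted, one possible variation is to reuse the find.characters helper from the question to build the column names. This is a minimal sketch, not the answerer's method; the name chars and the use of fixed = TRUE (so that any non-letter characters are matched literally rather than as regular expressions) are assumptions added for illustration.

```r
# Sample data from the question; stringsAsFactors = FALSE keeps col1 as
# character on R versions older than 4.0 as well.
d <- data.frame(col1 = c("ABC", "CDE", "ACE", "BDF"),
                stringsAsFactors = FALSE)

# Helper from the question: unique characters, uppercase letters first.
find.characters <- function(v1) {
  x1 <- unique(unlist(strsplit(v1, "")))
  indx <- grepl("[A-Z]", x1)
  c(sort(x1[indx]), sort(x1[!indx]))
}

# 'chars' is a name introduced here for illustration.
chars <- find.characters(d$col1)

# One logical column per character found in col1.
# fixed = TRUE matches each character literally, so a character such as
# "." or "*" would not be interpreted as regex syntax.
d[chars] <- lapply(chars, function(ch) grepl(ch, d$col1, fixed = TRUE))

d
```

With the sample data this adds only the columns A through F, one per character returned by find.characters, instead of the full A to Z range.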