
Split character by multiple criteria in R

I have a character vector like this:

c("variable1+variable2 + variable3*variable4+ variable5")

I would like to split this string into a vector like:

c("variable1", "variable2", "variable3", "variable4", "variable5")

IMPORTANT 1: there are two kinds of separators, + and *.
IMPORTANT 2: sometimes there is a blank space between the word I want and the separator, and sometimes there is not.

Miquel asked Oct 27 '25 05:10


2 Answers

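You can use the stringr package: replace every run of non-word characters with a single space, squish the whitespace, then split on spaces.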

library(stringr)
a <- c("variable1+variable2 + variable3*variable4+ variable5")

# str_split() returns a list; [[1]] extracts the character vector
str_split(str_squish(str_replace_all(a, "\\W+", " ")), " ")[[1]]

Output:

[1] "variable1" "variable2" "variable3" "variable4" "variable5"
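
Since `\\W+` treats any run of non-word characters as a separator, the same idea can be turned around: extract the runs of word characters instead of splitting on what lies between them. Below is a minimal base R sketch of that alternative, assuming the variable names contain only letters, digits, and underscores; the name `tokens` is just illustrative.

```r
a <- "variable1+variable2 + variable3*variable4+ variable5"

# gregexpr() locates every run of word characters (\w+);
# regmatches() pulls those runs out. The result is a list with one
# element per input string, so [[1]] selects the vector for `a`.
tokens <- regmatches(a, gregexpr("\\w+", a))[[1]]
tokens
```

This would not be suitable if a name could contain characters such as `.` or `-`, because those would also be treated as separators.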
TarJae answered Oct 29 '25 19:10


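In base R, we can use strsplit with a regular expression that matches either separator (* or +) together with any whitespace around it: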

out <- strsplit("variable1+variable2 + variable3*variable4+ variable5", 
          "\\s*[*+]\\s*")[[1]]

Output:

out
[1] "variable1" "variable2" "variable3" "variable4" "variable5"

The structure is

dput(out)
c("variable1", "variable2", "variable3", "variable4", "variable5"
)
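
In the pattern above, `\\s*` matches zero or more whitespace characters and `[*+]` matches either separator (inside a character class, `*` and `+` are literal). If the input holds several strings, strsplit returns a list with one element per string. Here is a small sketch of that case, assuming some inputs may also carry leading or trailing spaces; the names `x` and `parts` are made up for illustration.

```r
# Hypothetical input: several expressions, some padded with spaces
x <- c("a+b * c", " d*e ")

# trimws() removes leading/trailing whitespace so no empty or padded
# pieces appear; the pattern then consumes any spaces around * or +
parts <- strsplit(trimws(x), "\\s*[*+]\\s*")

# One character vector per input string
parts

# Flatten into a single vector if the grouping is not needed
all_vars <- unlist(parts)
all_vars
```

Keeping the list form is useful when each string's variables need to stay together; `unlist()` is only needed for a flat result.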
akrun answered Oct 29 '25 21:10


