
Split a string with a non-greedy regex via strsplit

Tags: regex, r, strsplit

I am facing a problem with regex and strsplit. I would like to split the following string x at its second : symbol

x <- "26/11/19, 22:16 - Super Mario: It's a me: Super Mario!, but also : the princess"

and obtain something like this:

"26/11/19, 22:16 - Super Mario"
" It's a me: Super Mario!, but also : the princess"

I am using strsplit with the following regular expression which, as far as my limited know-how goes, should mean "select ONLY a colon that is followed by a space and preceded ONLY by letters".

I tried to make the regex non-greedy with the ? symbol, but I am clearly missing something: the result is not what I expected, because the split also happens at me:.

I think a non-greedy operator is essential, because this string is just an example; of course the word Mario will not always be there.

strsplit(x, "(?<=[[:alpha:]]):(?= )", perl = TRUE)

Thank you in advance!

SabDeM asked Dec 23 '25 02:12



1 Answer

We can mark the first ':' that follows a letter, either by replacing it with another character or simply by doubling it with sub (which changes only the first match), and then use strsplit on that marker

strsplit(sub("([[:alpha:]]):", "\\1::", x),
       "(?<=[[:alpha:]]):{2,}(?= )", perl = TRUE)[[1]]
#[1] "26/11/19, 22:16 - Super Mario"       
#[2] " It's a me: Super Mario!, but also : the princess"
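
To see why the doubling step is needed, it may help to look at the pieces separately. The sketch below (the names all_splits and marked are only illustrative) first runs the pattern from the question on its own, then shows the intermediate string that sub produces. The ? in (?<= and (?= is part of the lookaround syntax rather than a laziness modifier, and strsplit splits at every match, so the original pattern also cuts at me:.

```r
x <- "26/11/19, 22:16 - Super Mario: It's a me: Super Mario!, but also : the princess"

# The pattern from the question matches the colon after "Mario" and the one
# after "me", so strsplit returns three pieces instead of two.
all_splits <- strsplit(x, "(?<=[[:alpha:]]):(?= )", perl = TRUE)[[1]]
length(all_splits)
#[1] 3

# sub() changes only the first match, so only the colon after "Mario" is doubled.
marked <- sub("([[:alpha:]]):", "\\1::", x)
marked
#[1] "26/11/19, 22:16 - Super Mario:: It's a me: Super Mario!, but also : the princess"
```

Splitting on two or more colons then leaves the single colon after me untouched.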

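Or with str_split from stringr, where n = 2 caps the result at two pieces, so only the first match is used as a split point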

library(stringr)
str_split(x, "(?<=[[:alpha:]]):(?= )", n = 2)[[1]]
#[1] "26/11/19, 22:16 - Super Mario"   
#[2] " It's a me: Super Mario!, but also : the princess"
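
If loading stringr is not an option, base R can probably get the same result in one pass. This is a different technique from the two shown above, offered only as a sketch: regexpr locates just the first match of the pattern, and regmatches with invert = TRUE returns the text before and after that match. The names first_colon and parts are illustrative.

```r
x <- "26/11/19, 22:16 - Super Mario: It's a me: Super Mario!, but also : the princess"

# regexpr() reports only the first match of the pattern.
first_colon <- regexpr("(?<=[[:alpha:]]):(?= )", x, perl = TRUE)

# invert = TRUE returns the text around the match instead of the match itself.
parts <- regmatches(x, first_colon, invert = TRUE)[[1]]
parts
#[1] "26/11/19, 22:16 - Super Mario"
#[2] " It's a me: Super Mario!, but also : the princess"
```

Because only one match is ever located, no marker character has to be inserted into the string first.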
akrun answered Dec 24 '25 17:12
