
Using Perl RegExp in R

Tags:

regex

r

I have a string from which I'm trying to extract the term preceding a keyword.

str = "This is a <Keyword>(-)Controlled design"

There can be either a space or a "-" between the keyword and "Controlled". I need to extract the keyword that comes before "Controlled". In Perl, I'm using the regular expression below:

/(\w+)[- ]controlled/i

I am trying the same in R after escaping the backslashes and setting perl=TRUE, but it doesn't work. How can I use this expression to extract the keyword in R? Is there an alternative expression or library that I can use?

Thanks in advance, simak

BRZ asked Jan 26 '26 18:01


1 Answer

Would something like this be good enough using gsub?

str <- "This is a keyword-Controlled design"

gsub("(.+\\s)?(\\w+)(\\s|-)(Controlled).+","\\2",str)
#[1] "keyword"
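
The Perl pattern from the question can also be carried over almost unchanged by pairing regexec() with regmatches(), which returns the capture group directly instead of substituting around it. This is only a minimal sketch: the helper name word_before_controlled is made up for illustration, and ignore.case = TRUE stands in for Perl's /i flag. The pattern uses nothing beyond \w and a bracket expression, so R's default regex engine should handle it.

```r
# Hypothetical helper (name is illustrative, not from the original post):
# returns the word immediately before "controlled", matched case-insensitively,
# where the separator is either a hyphen or a single space.
word_before_controlled <- function(x) {
  # regexec() records the positions of the full match and of each capture group;
  # regmatches() converts those positions into the matched substrings.
  m <- regmatches(x, regexec("(\\w+)[- ]controlled", x, ignore.case = TRUE))
  # Element 1 is the full match, element 2 is the first capture group.
  # Strings without a match yield character(0), which is mapped to NA here.
  vapply(m, function(g) if (length(g) >= 2) g[2] else NA_character_,
         character(1), USE.NAMES = FALSE)
}

word_before_controlled("This is a keyword-Controlled design")
#[1] "keyword"
```

Unlike the gsub() call, this does not depend on any text following "Controlled", and a string with no match gives NA rather than the unchanged input.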

And because regex is not the be all and end all:

# Split on a hyphen or a space, then take the token just before "Controlled"
spl <- unlist(strsplit(str, "[- ]"))
spl[which(spl == "Controlled") - 1]
#[1] "keyword"

thelatemail answered Jan 28 '26 06:01


