I have a vector of strings:
s <- c('abc1',   'abc2',   'abc3',   'abc11',   'abc12', 
       'abcde1', 'abcde2', 'abcde3', 'abcde11', 'abcde12', 
       'nonsense')
I would like a regular expression to match only the strings that begin with abc and end with 3, 11, or 12.  In other words, the regex has to exclude abc1 but not abc11, abc2 but not abc12, and so on.
I thought that this would be easy to do with lookahead assertions, but I haven't found a way. Is there one?
EDIT: Thanks to posters below for pointing out a serious ambiguity in the original post.
In reality, I have many strings.  They all end in digits: some in 0, some in 9, some in the digits in between.  I am looking for a regex that will match all strings except those that end with a letter followed by a 1 or a 2.  (The regex should also match only those strings that start with abc, but that's an easy problem.)
I tried to build such a regex with negative lookahead assertions, but had no success.
Thanks to all who replied and commented.  Inspired by several of you, I ended up using this combination: grepl('^abc', s) & !grepl('[[:lower:]][12]$', s).
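For anyone who still wants the single-regex version, here is a minimal sketch of how a negative lookahead could express the same rule. It assumes `perl = TRUE` so that R uses PCRE, which supports lookaheads. The variable names `two_step` and `one_step` are made up for illustration. The lookahead has to sit before `abc`; placed after it, the lookahead can no longer see the `c` in `abc1` and lets that string through.

```r
s <- c('abc1',   'abc2',   'abc3',   'abc11',   'abc12',
       'abcde1', 'abcde2', 'abcde3', 'abcde11', 'abcde12',
       'nonsense')

# Two simple patterns: starts with abc, and does not end in
# a lowercase letter followed by 1 or 2.
two_step <- grepl('^abc', s) & !grepl('[[:lower:]][12]$', s)

# One pattern: the negative lookahead is checked at the start of the
# string and rejects anything ending in <lowercase letter><1 or 2>;
# only then is the literal prefix abc matched.
one_step <- grepl('^(?!.*[[:lower:]][12]$)abc', s, perl = TRUE)

s[one_step]
# [1] "abc3"    "abc11"   "abc12"   "abcde3"  "abcde11" "abcde12"

identical(two_step, one_step)
# [1] TRUE
```

Both approaches select the same strings here; the two-call version is arguably easier to read and to change later.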
Instead of one complicated regular expression, in this case I think it's easier to use two simple regular expressions:
s <- c('abc1',   'abc2',   'abc3',   'abc11',   'abc12', 
       'abcde1', 'abcde2', 'abcde3', 'abcde11', 'abcde12', 
       'nonsense')
s[grepl("^abc", s) & grepl("(3|11|12)$", s)]
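If a single pattern is preferred, the two conditions can probably be folded together without any lookahead, since the prefix and the suffix are both fixed. The sketch below is only an illustration; `two_patterns` and `one_pattern` are names invented here.

```r
s <- c('abc1',   'abc2',   'abc3',   'abc11',   'abc12',
       'abcde1', 'abcde2', 'abcde3', 'abcde11', 'abcde12',
       'nonsense')

# The two-pattern approach: prefix test AND suffix test.
two_patterns <- s[grepl("^abc", s) & grepl("(3|11|12)$", s)]

# The same conditions in one pattern: abc at the start, anything in
# between, then 3, 11, or 12 at the end.
one_pattern <- grep("^abc.*(3|11|12)$", s, value = TRUE)

two_patterns
# [1] "abc3"    "abc11"   "abc12"   "abcde3"  "abcde11" "abcde12"

identical(two_patterns, one_pattern)
# [1] TRUE
```

Note that both forms follow the original wording of the question (ends with 3, 11, or 12). Under the edited requirement a string such as 'abc4' should also be kept, and neither pattern here would keep it.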