I want to match all literals of the form "literal (", i.e. a literal followed by a space and then an opening parenthesis. However, if the literal is "hi", "hello" or "bye", it should not be matched.
So I am looking for the following result:
Literal :: Result
--------------------------------
Hello ( :: Match
There ( :: Match
hello ( :: Not Match
New ( :: Match
hi ( :: Not Match
I am trying to do this with a negative lookahead, so I wrote:
(^|\s)(?!((hello|hi|bye)(\s\()))
But it matches every case.
I can't do it with a lookbehind either, because the lookbehind doesn't accept a variable-length expression.
Is there a regex that does this?
UPDATE
I'm trying this with Perl and Checkstyle (I don't know which regex flavor Checkstyle uses).
The lookahead gives a match in both.
With the lookbehind, Perl gives the error "Variable length lookbehind not implemented in regex m/(?<!(hello|hi|bye))\s\(/", whereas in Checkstyle I get the desired result.
Your regex doesn't work because it will always match the space between the literal and the ( (the space matches (^|\s), and the ( that follows it doesn't match ((hello|hi|bye)(\s\()), so the negative lookahead succeeds). It will also match whitespace in many other places.
Test to show what yours matches.
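The behaviour described above can be reproduced with a short script. This is a minimal sketch that uses Python's re module instead of Perl or Checkstyle (an assumption made for illustration; the helper name first_match is hypothetical). It reports where the pattern from the question matches, which shows that for the excluded words the match is only the space before the parenthesis.

```python
import re

# The pattern from the question, unchanged.
ORIGINAL = re.compile(r"(^|\s)(?!((hello|hi|bye)(\s\()))")

def first_match(text):
    """Return (start, end, matched_text) for the first match, or None."""
    m = ORIGINAL.search(text)
    return (m.start(), m.end(), m.group(0)) if m else None

for sample in ["Hello (", "hello (", "hi ("]:
    # Every sample yields a match; for "hello (" and "hi (" it is just the space.
    print(repr(sample), "->", first_match(sample))
```

For "Hello (" the match is the empty string at the start of the input; for "hello (" and "hi (" the first attempt at the start is rejected by the lookahead, but the engine then moves on and matches the space instead.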
This regex should work:
\b(?!(?:hello|hi|bye)\s)\w+\s\(
Test for this regex.
Explanation:
\b - word boundary.
(?!(?:hello|hi|bye)\s) - negative look-ahead for hello, hi or bye followed by a space.
The \s inside the look-ahead means that words which merely start with an excluded word, such as byelo (, are still matched. Remove the \s if this is not desired.
(?:hello|hi|bye) as opposed to simply (hello|hi|bye) just makes it a non-capturing group; it doesn't change what is matched.
\w+ - one or more word characters (word characters are typically [A-Za-z0-9_]).
\s - a whitespace character.
\( - a literal opening parenthesis.
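The explanation above can be checked against the cases from the question. The following is a minimal sketch in Python (chosen only for illustration; the regex itself is copied unchanged, and the helper name is_match is hypothetical).

```python
import re

# The regex from the answer above, unchanged.
PATTERN = re.compile(r"\b(?!(?:hello|hi|bye)\s)\w+\s\(")

def is_match(text):
    """True if the text contains an allowed literal followed by ' ('."""
    return PATTERN.search(text) is not None

samples = ["Hello (", "There (", "hello (", "New (", "hi (", "bye (", "byelo ("]
for sample in samples:
    print(f"{sample!r} :: {'Match' if is_match(sample) else 'Not Match'}")
```

The match is case-sensitive, so Hello ( is accepted while hello ( is rejected, as the question requires.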
If you are using a Perl-compatible regex engine (PCRE), you should be able to use a zero-width negative lookbehind assertion like this:
(?<!hello|hi|bye) \(
An example using R (with Perl compatibility switched on):
string <- c( "hello (" , "hi (" , "bye (" , "Hello (" , "Anything (" )
grepl( pattern = "(?<!hello|hi|bye) \\(" , string , perl = TRUE )
[1] FALSE FALSE FALSE TRUE TRUE
We can be a bit more precise, like so:
^.+(?<!^hello|^hi|^bye)\s\(
This matches the start of the string, then one or more characters, but not hello, hi or bye on their own at the start of the string, then a whitespace character, then an opening parenthesis.
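Some engines reject a lookbehind whose alternatives have different lengths; Perl reports this in the update to the question, and Python's re module does the same. For such engines, one possible workaround is to write the simple check (?<!hello|hi|bye) \( as one fixed-width lookbehind per word. The sketch below is an assumption for illustration, not part of the original answer, and the names compiles and CHAINED are hypothetical.

```python
import re

def compiles(pattern):
    """Report whether Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

# One fixed-width lookbehind per excluded word (hypothetical rewrite).
CHAINED = re.compile(r"(?<!hello)(?<!hi)(?<!bye) \(")

strings = ["hello (", "hi (", "bye (", "Hello (", "Anything ("]
results = [CHAINED.search(s) is not None for s in strings]

# The single variable-length lookbehind is rejected by Python's re module.
print(compiles(r"(?<!hello|hi|bye) \("))
print(results)
```

Like the single lookbehind, the chained form only inspects the characters directly before the space, so a word that merely ends in an excluded word (for example Othello () is rejected as well.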