
Python Negative Lookbehind with a variable number of characters

I know there are a lot of regex and negative lookbehind questions, but I have one that I cannot find an answer to. I want to find instances of water, but not when never appears before it, with a variable number of characters between the two words. The number of characters between them is unbounded, and lookbehind does not allow a variable number of characters. The pattern I have does exclude never, but it excludes it anywhere in the text, even at the very start. Is there a way to limit a lookbehind to only 20 or 30 characters? What I have:

(?i)^(?=.*?(?:water))(?:(?!never).)*$

Just some of the examples I am working with:

water                                                         (match)
I have water                                                  (match)
I never have water
Where is the water.                                           (match)
I never have food or water
I never have food but I always have water                     (match)
I never have food or chips. I like to walk. I have water      (match)

Again, the problem is that I could have a paragraph that is 10 sentences long, and if never appears anywhere in it, the pattern will not find water. On top of that, lookbehind does not accept a variable number of characters. I appreciate any help you could give.

asked Nov 07 '25 01:11 by Shawn Jamal


2 Answers

You can use this regex with Python's built-in re module:

(?i)^(?!.*\bnever\b.{,20}\bwater\b).*\bwater\b


RegEx Details:

  • (?i): Enable ignore case mode
  • ^: Start of the string
  • (?!.*\bnever\b.{,20}\bwater\b): Negative lookahead condition. It fails the match if the word never is followed by the word water with at most 20 characters in between ({,20} means 0 to 20 characters).
  • .*\bwater\b: Find word water anywhere in the line
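
As a quick check, here is a minimal sketch that runs the pattern above against the sample lines from the question. The names PATTERN, SAMPLES and has_water are only illustrative and are not part of the question.

```python
import re

# Pattern from the answer above; {,20} means "0 to 20 characters" in Python's re.
PATTERN = re.compile(r"(?i)^(?!.*\bnever\b.{,20}\bwater\b).*\bwater\b")

# Sample lines from the question, paired with the outcome the asker wants.
SAMPLES = [
    ("water", True),
    ("I have water", True),
    ("I never have water", False),
    ("Where is the water.", True),
    ("I never have food or water", False),
    ("I never have food but I always have water", True),
    ("I never have food or chips. I like to walk. I have water", True),
]

def has_water(line):
    """Return True if the line contains water with no never close before it."""
    return PATTERN.search(line) is not None

for line, expected in SAMPLES:
    print(f"{line!r}: {has_water(line)} (wanted {expected})")
```

Two things to keep in mind: the lookahead is evaluated once for the whole line, so a single never within 20 characters of any water rejects the line even if another water appears later. Also, since . does not match a newline by default, text that spans several lines would likely need re.S (or per-line processing).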
answered Nov 09 '25 16:11 by anubhava


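Negative lookbehind with a variable number of characters is not supported by Python's re module. What you can do instead is check whether never comes before water, and return False in that case. Note that this simple version does not limit the distance between the two words. For example: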

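import re

def test(string):
    # Reject the string if "never" appears anywhere before "water".
    if re.match('.*never.*water.*', string):
        return False
    # Otherwise accept it only if it contains "water" at all.
    elif re.match('.*water.*', string):
        return True
    else:
        return False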
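
If the distance limit from the question matters, the same two-step idea can be bounded. The following is only a sketch under assumptions: the function name find_water and the max_gap parameter are made up for illustration, and the gap is counted from the end of never to the start of water.

```python
import re

WATER = re.compile(r"\bwater\b", re.IGNORECASE)
NEVER = re.compile(r"\bnever\b", re.IGNORECASE)

def find_water(text, max_gap=20):
    """Return start offsets of each water that has no never within
    max_gap characters before it (hypothetical helper)."""
    hits = []
    for m in WATER.finditer(text):
        # A never whose end lies within max_gap characters of this water
        # must start inside this window.
        window_start = max(0, m.start() - max_gap - len("never"))
        # pos/endpos restrict the search without slicing, so \b still
        # sees the real characters before window_start.
        if not NEVER.search(text, window_start, m.start()):
            hits.append(m.start())
    return hits

print(find_water("I never have water"))
print(find_water("I never have food but I always have water"))
print(find_water("I never have water. Where is the water"))
```

Because every occurrence of water is checked separately, a later water can still be reported even when an earlier one is ruled out by a nearby never.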
answered Nov 09 '25 16:11 by shubham