How can I make sure that part of a pattern (a keyword, in this case) is present in the match when it can appear in different places? I want a match only when the keyword occurs at least once.
Regex:
\b(([0-9])(xyz)?([-]([0-9])(xyz)?)?)\b
We only want the value if there is a keyword: xyz
Examples:
1. 1xyz-2xyz - it's OK
2. 1-2xyz - it's OK
3. 1xyz - it's OK
4. 1-2 - there should be no match, because xyz does not occur at all
I tried a positive lookahead and a lookbehind, but neither works in this case.
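To make the problem concrete, here is a minimal reproduction in Python. It assumes the four example values are tested one at a time; the names `samples` and `results` are only illustrative.

```python
import re

# The pattern from the question, unchanged.
pattern = re.compile(r"\b(([0-9])(xyz)?([-]([0-9])(xyz)?)?)\b")

samples = ["1xyz-2xyz", "1-2xyz", "1xyz", "1-2"]

# Both (xyz)? groups are optional, so nothing forces xyz to appear:
# every sample matches in full, including the unwanted "1-2".
results = {s: bool(pattern.fullmatch(s)) for s in samples}
print(results)
```

All four values come back as matches, which is why an extra condition on the two xyz groups is needed.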
You can make use of a conditional construct:
\b([0-9])(xyz)?(?:-([0-9])(xyz)?)?\b(?(2)|(?(4)|(?!)))
See the regex demo. Details:
- \b - word boundary
- ([0-9]) - Group 1: a digit
- (xyz)? - Group 2: an optional xyz string
- (?:-([0-9])(xyz)?)? - an optional sequence of a -, a digit (Group 3), and an optional xyz char sequence (Group 4)
- \b - word boundary
- (?(2)|(?(4)|(?!))) - a conditional: if Group 2 (the first (xyz)?) matched, the match is returned; if not, Group 4 (the second (xyz)?) is checked, and the match is returned if it matched; otherwise, the match fails.
See the Python demo:
import re
text = "1. 1xyz-2xyz - it's OK\n2. 1-2xyz - it's OK\n3. 1xyz - it's OK\n4. 1-2 - there should be no match"
pattern = r"\b([0-9])(xyz)?(?:-([0-9])(xyz)?)?\b(?(2)|(?(4)|(?!)))"
print( [x.group() for x in re.finditer(pattern, text)] )
Output:
['1xyz-2xyz', '1-2xyz', '1xyz']
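If the conditional construct is hard to read, or the regex flavor in use does not support it, the same check can be sketched in code instead: match without the conditional and keep a match only when Group 2 or Group 4 participated. This is an alternative to the pattern above, not an equivalent of it in every case, and the sample `text` is made up from the question's examples.

```python
import re

# Same pattern as above, minus the trailing conditional.
pattern = re.compile(r"\b([0-9])(xyz)?(?:-([0-9])(xyz)?)?\b")

text = "1xyz-2xyz, 1-2xyz, 1xyz, 1-2"

# Keep a match only if Group 2 or Group 4 (either xyz) took part in it.
matches = [m.group() for m in pattern.finditer(text) if m.group(2) or m.group(4)]
print(matches)
```

Note that a discarded match still consumes its text, so on chained input such as 1-2-3xyz this filter yields 3xyz, while the conditional pattern yields 2-3xyz.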
Indeed you could use a lookahead in the following way:
\b\d(?:xyz|(?=-\dxyz))(?:-\d(?:xyz)?)?\b
See this demo at regex101 (or using ^ start and $ end anchors).
The first part either matches xyz directly after the first digit or, if there is none, uses the lookahead to ensure that xyz occurs in the second part. The second part stays optional in the pattern, but when the first xyz is absent, the lookahead makes it mandatory.
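A quick way to check the lookahead pattern against the four examples from the question is sketched below. The names `samples` and `found` are only illustrative, and each value is tested on its own.

```python
import re

# Lookahead variant: xyz must follow the first digit, or else the
# lookahead requires "-<digit>xyz" to come next.
pattern = re.compile(r"\b\d(?:xyz|(?=-\dxyz))(?:-\d(?:xyz)?)?\b")

samples = ["1xyz-2xyz", "1-2xyz", "1xyz", "1-2"]

# The pattern has no capturing groups, so findall returns whole matches.
found = {s: pattern.findall(s) for s in samples}
for sample, hits in found.items():
    print(sample, "->", hits)
```

The first three values are returned in full, and 1-2 produces an empty list.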