Can anyone suggest a regex to match the underscore in the following examples:
test_test
test[_test
test_]
But NOT match this:
test[_]test
This is using the .NET regular expression library. I'm using this regex tester to check:
http://derekslager.com/blog/posts/2007/09/a-better-dotnet-regular-expression-tester.ashx
Try this:
_[^\]]|[^[]_
It is an alternation of _[^\]] (an underscore followed by any character other than ]) and [^[]_ (any character other than [ followed by an underscore).
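As a quick check of the alternation above, here is a minimal sketch in Python. Python's re module is only a stand-in for the .NET library the question uses (an assumption on my part; this pattern uses nothing engine-specific), and matched_text is a hypothetical helper name. It shows which text the pattern actually consumes for each sample.

```python
import re

# The consuming alternation from the answer. "[" is escaped inside the
# character class only to keep Python's parser unambiguous; it means the
# same thing as [^[].
PATTERN = re.compile(r"_[^\]]|[^\[]_")

def matched_text(s):
    """Return the text matched by PATTERN in s, or None if nothing matches."""
    m = PATTERN.search(s)
    return m.group(0) if m else None

for sample in ["test_test", "test[_test", "test_]", "test[_]test"]:
    print(sample, "->", matched_text(sample))
```

Every match is two characters long: the underscore plus one neighbor. Which neighbor is included depends on which alternative matches first from the left.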
Or if you want to use look-around assertions to really match just the underscore and not surrounding characters:
_(?=[^\]])|_(?<=[^[]_)
This matches any underscore that is followed by a character other than ] ((?=[^\]]), positive look-ahead) or any underscore that is preceded by a character other than [ ((?<=[^[]_), positive look-behind). And this can be combined to:
_(?:(?=[^\]])|(?<=[^[]_))
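To see that the combined look-around form matches only the underscore itself, here is a small sketch, again using Python's re as a stand-in for .NET (an assumption; underscore_spans is a hypothetical helper). It also shows a consequence of using positive assertions: both of them require a neighboring character to exist, so an underscore at the very start of the string followed by ], or at the very end preceded by [, is not matched.

```python
import re

# Combined look-around pattern; "[" is escaped in the class for Python's sake.
LOOKAROUND = re.compile(r"_(?:(?=[^\]])|(?<=[^\[]_))")

def underscore_spans(s):
    """Return (start, end) for every underscore the pattern accepts in s."""
    return [m.span() for m in LOOKAROUND.finditer(s)]

for sample in ["test_test", "test[_test", "test_]", "test[_]test",
               "_]Test", "Test[_"]:
    print(sample, "->", underscore_spans(sample))
```

Each reported span is one character wide. The last two samples yield no spans, because there is no character for the remaining assertion to test.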
_(?!\](?<=\[_\]))
If the underscore isn't followed by a closing bracket, the negative lookahead succeeds immediately. Otherwise, it does a lookbehind to find out if the underscore is also preceded by an opening bracket. You can replace the "_]" with dots to make it clear that you're only interested in the opening bracket this time:
_(?!\](?<=\[..))
You can do the lookbehind first if you want:
_(?<!\[_(?=\]))
The important thing is that the second lookaround has to be nested within the first one in order to achieve the "NOT (x AND y)" semantics.
Testing it in EditPad Pro, it matches the underscore in all but the last of these strings:
test_test
test[_test
test_]
_]Test
Test[_
test[_]test
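The test above can be reproduced with a short script. This is a sketch in Python rather than EditPad Pro or .NET (an assumption; the two patterns use only fixed-width lookbehind, which Python's re also accepts), and accepts is a hypothetical helper name.

```python
import re

# Nested negative lookaround: reject only when "]" follows AND "[" precedes.
NESTED = re.compile(r"_(?!\](?<=\[_\]))")
# Same idea with the already-known "_]" replaced by dots.
NESTED_DOTS = re.compile(r"_(?!\](?<=\[..))")

SAMPLES = ["test_test", "test[_test", "test_]", "_]Test", "Test[_",
           "test[_]test"]

def accepts(pattern, s):
    """True if the pattern finds an underscore it is willing to match in s."""
    return pattern.search(s) is not None

for sample in SAMPLES:
    print(sample, accepts(NESTED, sample), accepts(NESTED_DOTS, sample))
```

Both patterns accept the first five strings and reject only test[_]test, including the cases where the underscore sits at the very start or end of the string.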
EDIT: here's an easier-to-read version:
(?<!\[)_|_(?!\])
What I like about the nested-lookaround version is that it doesn't do anything until it actually finds an underscore. Unless the regex engine is smart enough to optimize it away, this "(NOT x) OR (NOT y)" version will do a negative lookbehind at every single position.
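The "NOT (x AND y)" and "(NOT x) OR (NOT y)" versions should accept exactly the same underscores. A brute-force sketch in Python (a stand-in for .NET, which is an assumption; the helper names and the choice of neighbor characters are hypothetical) compares them over every combination of preceding and following character, including no neighbor at all. It checks only which underscores are accepted, not how much work each engine does to get there.

```python
import re
from itertools import product

NESTED = re.compile(r"_(?!\](?<=\[_\]))")  # NOT (x AND y)
SIMPLE = re.compile(r"(?<!\[)_|_(?!\])")   # (NOT x) OR (NOT y)

# Neighbors to try on each side: nothing, each bracket, an ordinary letter.
NEIGHBORS = ["", "[", "]", "a"]

def candidates():
    """Every string of the form <before>_<after> built from NEIGHBORS."""
    return [b + "_" + a for b, a in product(NEIGHBORS, repeat=2)]

def agree_everywhere():
    """True if both patterns accept or reject every candidate alike."""
    return all(bool(NESTED.search(s)) == bool(SIMPLE.search(s))
               for s in candidates())

def only_rejected():
    """Candidates in which the simple pattern matches no underscore."""
    return [s for s in candidates() if not SIMPLE.search(s)]

print(agree_everywhere(), only_rejected())
```

The two patterns agree on all sixteen candidates, and the only string rejected is [_].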