
How do I match exact word boundary, but excluding special characters in front and back?

I want a regex in Tcl that matches a given word at its word boundaries, but only when there are no special characters such as +-.() directly in front of or behind it.

Here is what I tried; none of it matches properly:

Let's say I have the following string:

hello world +hello world -hello world hello+ hello

I want it to only match hello, not hello+ or -hello.

\bhello\b
  • hello
  • +hello
  • -hello
  • hello+
[^+-]\bhello\b[^+-]
  • no matches
[^+-]\bhello\b
  • (doesn't match the first hello even though it should've matched)
  • hello+
  • hello
(?![+-])\bhello\b(?![+-])
  • hello
  • +hello
  • -hello
adrive asked Dec 06 '25 04:12


2 Answers

As documented, Tcl uses \y to match a word boundary, not \b (which is a backspace character for compatibility with the escapes used by general Tcl code). This means you need an RE something like this:

(?:^|[^-+])\yhello\y(?:$|[^-+])

The middle piece is \yhello\y which matches the word, and then we need ^|[^-+] at the beginning to match either the beginning of the string or a character other than - or +, and equivalently $|[^-+] for the end. (I put those in (?:…) just to limit the scope of the | RE operator.)
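
As a quick check of the \y versus \b distinction described above, here is a minimal sketch for tclsh. The sample strings and variable names are made up for illustration and are not from the question.

```tcl
# Inside a Tcl regex, \b means a literal backspace character,
# so this pattern does not find "hello" in ordinary text.
set withB [regexp {\bhello\b} "say hello there"]

# \y is the word-boundary constraint in Tcl's regex syntax.
set withY [regexp {\yhello\y} "say hello there"]

# On its own, \y treats + as a non-word character, so "+hello" is
# still accepted; that is why the extra (?:^|[^-+]) piece is needed.
set plusY [regexp {\yhello\y} "+hello"]

puts "backslash-b: $withB, backslash-y: $withY, backslash-y on +hello: $plusY"
```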

Demonstrating from an interactive session:

% set RE {(?:^|[^-+])\yhello\y(?:$|[^-+])}
(?:^|[^-+])\yhello\y(?:$|[^-+])
% regexp $RE "hello"
1
% regexp $RE "ahello"
0
% regexp $RE "+hello"
0
% regexp $RE "+ hello"
1
% regexp $RE "hello+"
0
% regexp $RE "hello-"
0
% regexp $RE "hello.-"
1
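
The same pattern can be tried against the sample string from the question. This is only a sketch: the capture group around hello is my addition so that the word's position can be reported separately from the surrounding character that the pattern also consumes.

```tcl
# Pattern from above, with a capture group added around the word (an assumption
# made for this example; the original pattern has no capture group).
set RE {(?:^|[^-+])\y(hello)\y(?:$|[^-+])}
set text "hello world +hello world -hello world hello+ hello"

# -all without -inline returns the number of matches.
set count [regexp -all $RE $text]

# -all -inline -indices returns, for each match, the index pair of the
# whole match followed by the index pair of the capture group.
set spans [regexp -all -inline -indices $RE $text]

puts "matches: $count"
foreach {whole word} $spans {
    puts "hello at [lindex $word 0]..[lindex $word 1]"
}
```

Because the pattern consumes the character on each side of the word, two occurrences separated by a single character (for example "hello hello") may not both be reported by -all; replacing the trailing group with a lookahead such as (?=$|[^-+]) is one way to avoid consuming the following character.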
Donal Fellows answered Dec 07 '25 20:12


This regex matches the word hello only when it is surrounded by whitespace or sits at the start or end of the string. Anything else next to it, whether a word character or a special character, prevents the match.

(?<!\S)hello(?!\S)

This uses the "negative look-ahead" and "negative look-behind" syntax.

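(?<!\S): Look behind and make sure the character immediately before is not a non-whitespace character (the start of the string is allowed).

(?!\S): Look ahead and make sure the character immediately after is not a non-whitespace character (the end of the string is allowed).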
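
One caveat for Tcl specifically: as far as I know, Tcl's regex syntax documents lookahead constraints, (?=…) and (?!…), but no lookbehind, so (?<!\S) may be rejected when the pattern is compiled. A rough Tcl equivalent of the same idea, under that assumption, replaces the lookbehind with (?:^|\s), which consumes the preceding whitespace character instead of only checking for it. The pattern and sample strings below are illustrative.

```tcl
# Check whether this Tcl build accepts the lookbehind form at all;
# catch returns a non-zero code if the pattern fails to compile.
set lookbehindFailed [catch {regexp {(?<!\S)hello(?!\S)} "hello"} msg]
puts "lookbehind pattern rejected: $lookbehindFailed ($msg)"

# Lookahead-only variant: start of string or a whitespace character before
# the word, and no non-whitespace character after it.
set RE {(?:^|\s)hello(?!\S)}

set results {}
foreach s {"hello" "say hello there" "+hello" "hello+" "ahello"} {
    lappend results [regexp $RE $s]
    puts [format "%-18s -> %d" $s [lindex $results end]]
}

# Count of matches in the sample string from the question.
set count [regexp -all $RE "hello world +hello world -hello world hello+ hello"]
puts "matches in sample: $count"
```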

Nova answered Dec 07 '25 18:12