Don't know how to use lookarounds properly to achieve my Regex match

Question

I'm writing a Perl script, and part of it needs to match all occurrences of a certain pattern in a string. A regular expression seems like it should be powerful enough, but I just can't get it right for this particular string.

A hypothetical example of the type of text the regex might be applied to would be:

1cat;2dog;!3monkey;!4horse;

As you can see, the line contains several data entries (1cat, 2dog, etc.), each terminated by a semicolon. The beginning of the line has no semicolon, but the end does. I want to match every entry that hasn't been negated with a leading !. In the above example, 1cat and 2dog would be matched and returned in list context, while 3monkey and 4horse would not.

What I have tried to do so far is use negative lookbehinds to notice only the entries without a !. Something like this:

m/(?<!\!)(\w+)\;/g

However, this doesn't work, because for every !'ed entry the regex just skips the first character after the ! and matches the rest of the entry, up to the semicolon. In the example, 1cat and 2dog are captured, but then so are monkey and horse.
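The behaviour described above can be reproduced with a short script. This is only a minimal sketch: the variable names $line and @matches are made up for illustration, and the sample string is the one from the example.

```perl
use strict;
use warnings;

# Sample line from the example above.
my $line = '1cat;2dog;!3monkey;!4horse;';

# The attempted pattern: the lookbehind only inspects the single
# character immediately before the position where \w+ starts, so the
# engine can start one character later, in the middle of an entry.
my @matches = $line =~ m/(?<!\!)(\w+)\;/g;

print join(',', @matches), "\n";   # prints: 1cat,2dog,monkey,horse
```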

I feel like this is easily doable, but I'm new to regular expressions and I can't think of anything else.

Sam · Accepted Answer

Throw a word boundary (\b) in there and you should be good:

(?<!!)\b(\w+);

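As you could tell, your negative lookbehind was working, but the regex could still start matching one character later, where the preceding character is no longer a ! (horse from !4horse). A word boundary is a zero-width assertion: it checks a condition at a position without consuming any characters (like the anchors ^ and $). It asserts for this: (^\w|\w\W|\W\w|\w$). In other words, it succeeds anywhere a word character ([a-zA-Z0-9_] for ASCII text) is next to the beginning/end of the string or a non-word character. Since 4 and h are both word characters, there is no boundary between them, so the match can no longer start in the middle of an entry.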
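As a quick check, here is a minimal sketch that applies the pattern with the word boundary to the sample line from the question. The variable names $line and @kept are just placeholders for illustration.

```perl
use strict;
use warnings;

# Sample line from the question.
my $line = '1cat;2dog;!3monkey;!4horse;';

# (?<!!)  the entry must not be preceded by a !
# \b      the match must start at the beginning of a word, so it
#         cannot begin in the middle of an entry such as 4horse
# (\w+);  capture the entry up to, but not including, the semicolon
my @kept = $line =~ /(?<!!)\b(\w+);/g;

print join(',', @kept), "\n";   # prints: 1cat,2dog
```

In list context, a match with /g and a single capture group returns every captured entry, which is what the question asks for.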

Tags:

regex

delimiter

perl

regex-lookarounds

DDP
