 

How can I match the lowercase version of a backreference?

Tags:

regex

grep

I'd like to match the lowercase version of an uppercase character in a backreference in a regex. For example, let's say I want to match a string where the 1st character is any uppercase character and the 4th character is the same letter as the first except it's a lowercase character. If I use grep with this regex:

grep -E "([A-Z])[a-z]{2}\1[a-z]"

it would match "EssEx" and "SusSe", for instance. I'd like to match "Essex" and "Susse" instead. Is it possible to modify the above regular expression to achieve this?

Manos Nikolaidis asked Sep 20 '25 13:09



1 Answer

This is one of the cases where inline modifiers come in handy. Here is a solution that uses a case-sensitive negative lookahead to check that the next character is not exactly the same (uppercase) character, followed by a case-insensitive backreference to match the corresponding lowercase letter:

([A-Z])[a-z]{2}(?-i)(?!\1)(?i)\1[a-z]

Note that the (?-i) most likely isn't needed, but it's there for clarity. Inline modifiers are not supported by all regex flavours. PCRE supports them, so you will have to use grep with -P instead of -E.
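
As a rough sketch of how this could be run, assuming GNU grep built with PCRE support: the helper name match_lower_backref and the sample words are made up for illustration, while the pattern is the one given above, unchanged.

```shell
# Pattern from the answer, unchanged. Single quotes keep the shell from touching \1.
pattern='([A-Z])[a-z]{2}(?-i)(?!\1)(?i)\1[a-z]'

# Hypothetical helper: print the stdin lines that match the pattern.
# -P selects PCRE, which GNU grep needs for (?i), (?-i) and (?!...).
match_lower_backref() {
  grep -P "$pattern"
}

# Made-up sample words, one per line.
# Prints "Essex" and "Susse"; "EssEx" and "SusSe" are rejected by the lookahead.
printf '%s\n' Essex EssEx Susse SusSe | match_lower_backref
```

One thing to watch: (?i) stays in effect until the end of the pattern, so the final [a-z] is also matched case-insensitively and would accept an uppercase letter there. If that matters, scoping the modifier to the backreference alone, as in (?i:\1), should avoid it.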

Sebastian Proske answered Sep 23 '25 14:09
