I want to delete all instances of "aA", "bB" ... "zZ" from an input string.
e.g.
echo "foObar" |
sed -Ee 's/([a-z])\U\1//'
should output "fbar"
But the \U syntax works in the latter half (replacement part) of the sed expression - it fails to resolve in the matching clause.
I'm having difficulty converting the matched character to upper case to reuse in the matching clause.
If anyone could suggest a working regex which can be used in sed (or awk) that would be great.
Scripting solutions in pure shell are ok too (I'm trying to think of solving the problem this way).
Working PCRE (Perl-compatible regular expressions) are ok too but I have no idea how they work so it might be nice if you could provide an explanation to go with your answer.
Unfortunately, I don't have perl or python installed on the machine that I am working with.
You may use the following perl solution:
echo "foObar" | perl -pe 's/([a-z])(?!\1)(?i:\1)//g'
See the online demo.
Details
([a-z]) - Group 1: a lowercase ASCII letter
(?!\1) - a negative lookahead that fails the match if the next char is the same as captured with Group 1
(?i:\1) - the same char as captured with Group 1 but in the different case (due to the lookahead before it).

The -e option allows you to define Perl code to be executed by the compiler and the -p option always prints the contents of $_ each time around the loop. See more here.
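Since the question says perl is not available on the machine, here is a minimal awk sketch of the same idea, not a drop-in equivalent of the regex. It assumes a POSIX awk, and the function name remove_pairs is made up for illustration. It scans each line left to right and skips any lowercase ASCII letter that is immediately followed by its own uppercase form:

```shell
# Hypothetical helper: reads lines on stdin, deletes every "aA", "bB" ... "zZ".
remove_pairs() {
  awk '{
    out = ""
    n = length($0)
    i = 1
    while (i <= n) {
      c = substr($0, i, 1)       # current character
      d = substr($0, i + 1, 1)   # next character ("" at end of line)
      # c must be a lowercase ASCII letter and d its uppercase counterpart
      if (index("abcdefghijklmnopqrstuvwxyz", c) > 0 && d == toupper(c)) {
        i += 2                   # drop the pair
      } else {
        out = out c              # keep the character
        i++
      }
    }
    print out
  }'
}

echo "foObar" | remove_pairs   # prints: fbar
```

Like the perl one-liner with the g flag, this makes a single pass, so a pair that only appears after another pair has been removed (e.g. "abBA") is left in place.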