So let's say I have a list of strings which sometimes end with a phrase that has been cut off to different lengths. In this example the phrase is "hello".
my @strings =
(
    "Test 1 hello",
    "Something else",
    "Test 2 hell",
    "And also he",
    "Test 4 hel"
);
This is how I would remove the "hello" fragments right now:
foreach my $string (@strings)
{
    if ($string =~ m/(.*?)\s*(h(e(l(lo?)?)?)?)?$/)
    {
        print "'", $string, "' -> '", $1, "'\n";
    }
}
It does work:
'Test 1 hello' -> 'Test 1'
'Something else' -> 'Something else'
'Test 2 hell' -> 'Test 2'
'And also he' -> 'And also'
'Test 4 hel' -> 'Test 4'
However, I find the regular expression to match all the "hello" fragments long, confusing and hard to modify for future use cases.
Is there a shorter way to write something equivalent to (h(e(l(lo?)?)?)?)?$?
One way is to build the regex as an alternation of the possible versions of the string. I think this should also extend well to more general uses:
use warnings;
use strict;
use feature 'say';
my $target = shift || 'hello';
my @strings = (
    "Test 1 hello",
    "Something else",
    "Test 2 hell",
    "And also he",
    "Test 4 hel"
);
my $re_versions = build_regex($target);
foreach my $string (@strings)
{
    if ($string =~ /($re_versions)$/)
    {
        say "'$string' --> $1";
    }
}
sub build_regex {
    my ($s) = @_;
    my @versions;
    # Test the length, not truthiness, so a remaining "0" is not skipped
    while (length $s) {
        push @versions, quotemeta $s;   # longest version first
        chop $s;
    }
    return join '|', @versions;
}
This isn't shorter (although it could certainly be written more compactly), but it should be easier to adjust: which versions of the string are acceptable, the order in which they are tried, and so on.
If there is a reason to want a compiled regex back, change the function's return to
my $re_str = join '|', @versions;
return qr/$re_str/;
where you can now also add flags that may be suitable.
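The code above only prints the matched fragment. To actually remove it, as in the question, the compiled regex can be used in a substitution. The following is a minimal sketch, not a definitive implementation: the names `build_prefix_regex` and `strip_fragment` are hypothetical, and it assumes the whitespace before the fragment is optional (`\s*`), as in the question's original regex.

```perl
use strict;
use warnings;

# Hypothetical helper: compiled regex matching any non-empty prefix
# of $word, longest first.
sub build_prefix_regex {
    my ($word) = @_;
    my @versions;
    while (length $word) {
        push @versions, quotemeta $word;
        chop $word;
    }
    my $alternation = join '|', @versions;
    return qr/$alternation/;
}

# Hypothetical helper: return a copy of $string without a trailing
# (possibly truncated) $word and the whitespace before it.
sub strip_fragment {
    my ($string, $word) = @_;
    my $re = build_prefix_regex($word);
    (my $copy = $string) =~ s/\s*(?:$re)$//;
    return $copy;
}

print "'$_' -> '", strip_fragment($_, 'hello'), "'\n"
    for "Test 1 hello", "Something else", "Test 2 hell", "And also he";
```

Because the whitespace is optional here, a string that merely ends in "h" (say "oh") would lose that last letter too, just as with the regex in the question. Use `\s+` instead if the fragment must be preceded by a space.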
You are looking for a regexp that matches the following expressions at the end of a string: hello, hell, hel, he, h. We can expect the expression to be preceded by at least one space.
You could just write:
s/\s+(?:hello|hell|hel|he|h)$// for @strings;
This modifies all elements of the array in place to what you expect. The alternatives have to be grouped; otherwise the \s+ applies only to the first one and the others leave the trailing space behind.
If needed, you can generate the match string automatically for any given word:
my $word = "hello";
my @parts = map { substr $word, 0, $_ } (1..(length $word));
my $match = join "|", map { quotemeta($_) } @parts;
s/\s+(?:$match)$// for @strings;
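If you would rather keep the nested-optional shape from the question, `(h(e(l(lo?)?)?)?)?`, that pattern can be generated from the word as well. This is only a sketch under stated assumptions: `nested_optional` is a hypothetical helper name, it uses non-capturing groups, and the preceding whitespace is taken as optional (`\s*`), as in the question.

```perl
use strict;
use warnings;

# Hypothetical helper: build the nested-optional pattern for $word,
# e.g. "hello" becomes (?:h(?:e(?:l(?:l(?:o)?)?)?)?)?
sub nested_optional {
    my ($word) = @_;
    my $pattern = '';
    # Wrap from the last character outwards, so that each character
    # is optional only together with everything that follows it.
    for my $char (reverse split //, $word) {
        $pattern = '(?:' . quotemeta($char) . $pattern . ')?';
    }
    return $pattern;
}

my $fragment = nested_optional('hello');
my @strings  = ("Test 1 hello", "Something else", "Test 2 hell", "Test 4 hel");

# Strip the fragment (and any whitespace before it) in place.
s/\s*(?:$fragment)$// for @strings;
print "'$_'\n" for @strings;
```

Since the whole group is optional, the substitution always matches. A string without the fragment is left unchanged, apart from losing any trailing whitespace.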