
Regular expression that matches the longest repeating sequence

I want to match the longest sequence that repeats at least once.

Given:

T_send_ack-new_amend_pending-cancel-replace_replaced_cancel_pending-cancel-replace_replaced

the result should be: pending-cancel-replace_replaced

mircea . asked Dec 18 '25 23:12



2 Answers

Try this:

(.+)(?=.*\1)

See it here on Regexr

This will match any character sequence with at least one character, that is repeated later on in the string.

You would need to store your matches and decide which one is the longest afterwards.

This solution requires your regex flavour to support backreferences and lookaheads.

The .+ matches any sequence of one or more characters and, because of the parentheses around it, stores that sequence in group 1. The positive lookahead (?=.*\1) then succeeds only if the captured sequence occurs again later in the string.
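
To make the capture and lookahead concrete, here is a minimal Perl sketch on two toy strings. The helper name repeated_parts and the inputs are made up for illustration; only the pattern comes from the answer above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return every group-1 capture of the pattern,
# scanning the string from left to right.
sub repeated_parts {
    my ($text) = @_;
    # In list context with /g, the match returns all group-1 captures.
    my @found = $text =~ /(.+)(?=.*\1)/g;
    return @found;
}

# "abc" is followed by a second "abc", so the lookahead succeeds.
print join(',', repeated_parts('abcabc')), "\n";    # prints "abc"

# Nothing in "abcd" occurs twice, so there are no matches at all.
my @none = repeated_parts('abcd');
print scalar(@none), "\n";                          # prints "0"
```

Because .+ is greedy, the engine tries the longest candidate at each starting position first and shortens it only until the lookahead succeeds.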

stema answered Dec 21 '25 18:12



Here is a Perl script that does the job:

#!/usr/bin/perl
use strict;
use warnings;
use 5.010;    # enables say

my $s = q/T_send_ack-new_amend_pending-cancel-replace_replaced_cancel_pending-cancel-replace_replaced/;
my $max = 0;     # length of the longest repeated sequence found so far
my $seq = '';    # the sequence itself
# Each match captures a sequence that occurs again later in the string.
while ($s =~ /(.+)(?=.*\1)/g) {
    if (length($1) > $max) {
        $max = length $1;
        $seq = $1;
    }
}
say "longest sequence : $seq, length = $max";

Output:

longest sequence : _pending-cancel-replace_replaced, length = 32
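
One caveat, as far as I can tell: with /g each match consumes its text, so a longer repeat that starts inside an earlier match is never tried. For the made-up string abcdeXabYbcde the loop above finds ab, cde and b, and misses bcde. A possible workaround is to wrap the whole pattern in a lookahead so that nothing is consumed and every start position is checked. The sketch below assumes that variant; the name longest_repeat is hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: longest substring that occurs again later in the
# string without overlapping its first occurrence.
sub longest_repeat {
    my ($text) = @_;
    my $best = '';
    # The outer lookahead is zero-width, so the scan advances one
    # character at a time and candidates may overlap each other.
    while ($text =~ /(?=(.+).*\1)/g) {
        $best = $1 if length($1) > length($best);
    }
    return $best;
}

print longest_repeat('abcdeXabYbcde'), "\n";    # prints "bcde"
```

On the string from the question this returns the same _pending-cancel-replace_replaced. The price is more backtracking, since every position is tested, which should not matter for short inputs like this one.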

Toto answered Dec 21 '25 19:12



