
.NET Regex overlapping matches taking last character

Tags: .net, regex

I have this RegEx that finds any permutation with one A, one B and two C's

(?:(?<A>A)|(?<B>B)|(?<C>C)){4}(?<-A>)(?<-B>)(?<-C>){2}

For example, this string has 3 matches (at positions 1, 7 and 15):

ABCCABCABCABCAABCC

If I wrap the pattern in a lookahead assertion, each match no longer consumes any characters, so the search resumes at the next position instead of after the complete sequence, and I can count overlapping matches:

(?=(?<value>(?:(?<A>A)|(?<B>B)|(?<C>C)){4}(?<-A>)(?<-B>)(?<-C>){2}))
   ^                                                               ^

And we'd have 7 matches in this example

1. ABCC
2. BCCA
3. CCAB
4. CABC
7. CABC
10. CABC
15. ABCC
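
For reference, the two counts above can be reproduced with a minimal C# sketch. The class name `PermutationMatches` and the helper `Positions` are made up for illustration; only `Regex.Matches` and the two patterns come from the post. Positions are printed 1-based to line up with the list above.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class PermutationMatches
{
    // Pattern from the question: one A, one B and two C's in any order.
    public const string Core =
        @"(?:(?<A>A)|(?<B>B)|(?<C>C)){4}(?<-A>)(?<-B>)(?<-C>){2}";

    // Same pattern wrapped in a lookahead, so matches are zero-width
    // and the engine retries at every position.
    public const string Overlapping = "(?=(?<value>" + Core + "))";

    // Hypothetical helper: 1-based start position of every match.
    public static int[] Positions(string pattern, string input) =>
        Regex.Matches(input, pattern)
             .Cast<Match>()
             .Select(m => m.Index + 1)
             .ToArray();

    public static void Main()
    {
        const string input = "ABCCABCABCABCAABCC";

        // Consuming matches: 1, 7, 15
        Console.WriteLine(string.Join(", ", Positions(Core, input)));

        // Zero-width matches: 1, 2, 3, 4, 7, 10, 15
        Console.WriteLine(string.Join(", ", Positions(Overlapping, input)));
    }
}
```

With the lookahead version, each `Match` has an empty `Value`; the four matched characters are available through `m.Groups["value"].Value`.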

stribizhev helped me with this in a previous post: .NET Regex number of overlaping matches

Now I need to find a run of three consecutive permutations of, for example, ABC, where each permutation overlaps the previous one by one character.

For example, for the following sequence:

AABCBACBCCAACCB

This contains such a sequence starting at position 1 (counting from 0 here):

Pos 1. ABC
Pos 3. CBA
Pos 5. ACB

So it looks for a sequence where any permutation of ABC appears 3 times in a row, with each permutation starting on the last character of the previous one.

I hope I explained this well.

How can I do this?

John Mathison asked Dec 27 '25 18:12


1 Answer

You can achieve this with a simple modification to @stribizhev's solution.

First, you need only one C instead of two, so the pattern covers three characters rather than four:

(?:(?<A>A)|(?<B>B)|(?<C>C)){3}(?<-A>)(?<-B>)(?<-C>)

Since each new permutation should start on the last character of the previous one, you can put the pattern inside a lookahead assertion and consume only two characters after it:

(?=(?:(?<A>A)|(?<B>B)|(?<C>C)){3}(?<-A>)(?<-B>)(?<-C>))..

Now repeat that three times and consume one final character:

(?:(?=(?:(?<A>A)|(?<B>B)|(?<C>C)){3}(?<-A>)(?<-B>)(?<-C>))..){3}.
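
As a quick check, here is a minimal C# sketch that runs the final pattern against the sample string from the question. The class name `ChainedPermutations` and the helper `First` are made up for illustration. `Match.Index` is 0-based, which lines up with the positions listed in the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ChainedPermutations
{
    // Final pattern from above: three ABC permutations,
    // each sharing one character with the next.
    public const string Pattern =
        @"(?:(?=(?:(?<A>A)|(?<B>B)|(?<C>C)){3}(?<-A>)(?<-B>)(?<-C>))..){3}.";

    // Hypothetical helper: first match of the pattern in the input.
    public static Match First(string input) => Regex.Match(input, Pattern);

    public static void Main()
    {
        Match m = First("AABCBACBCCAACCB");

        // Whole match: 7 characters starting at index 1.
        Console.WriteLine($"{m.Index}: {m.Value}");   // 1: ABCBACB

        // Split the match into its three overlapping triples (step of 2).
        for (int i = 0; i + 3 <= m.Length; i += 2)
            Console.WriteLine($"Pos {m.Index + i}. {m.Value.Substring(i, 3)}");
        // Pos 1. ABC
        // Pos 3. CBA
        // Pos 5. ACB
    }
}
```

Each lookahead pushes three captures and pops all three again, so the A, B and C stacks are empty when the next repetition starts; that is why the same check can be repeated inside `{3}`.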
user4003407 answered Dec 30 '25 10:12