In C#, how can I get the start/end indexes of all the replacements by the Regex.Replace() function

Tags:

c#

regex

I have made a program that highlights, on the fly, the phrases in the input matched by a given regular expression.

However, I want to highlight the replacements in the output panel too. To do this, I need to obtain the indexes and lengths found by Regex.Replace(). Unfortunately, it would seem C# doesn't give access to this data. Have I missed something?

I've thought about working out the indexes manually by accumulating offsets from the MatchCollection produced by Regex.Matches(). But this is error-prone, and may not take into account the special $ symbol in the replacement expression, which could throw the figures off.

There must be a more elegant way.

-------------------- EDIT:

Trying to build off sarh's answer, I've got this so far:

public List<Tuple<int, int>> replacementIndexes;

public void mainbody() {
    replacementIndexes = new List<Tuple<int, int>>();
    filteredOutput = Regex.Replace(inputText, pattern, match => interceptReplacements(match, replacementText), regexOptions);
}

public string interceptReplacements(Match m, string replacement)
{
    replacementIndexes.Add(new Tuple<int,int>(m.Index,m.Length));
    return replacement;
}

Unfortunately, the interceptReplacements() method uses the OLD match indexes from the input, and not the new replacement indexes. So we need to get kludgy. Here is a potential 'solution' based on the delta of the replacement lengths versus the match lengths:

int delta = 0;
public List<Tuple<int, int>> replacementIndexes;

public void mainbody() {
    delta = 0;
    replacementIndexes = new List<Tuple<int, int>>();
    filteredOutput = Regex.Replace(inputText, pattern, match => interceptReplacements(match, replacementText), regexOptions);
}

public string interceptReplacements(Match m, string replacement)
{
    replacementIndexes.Add(new Tuple<int,int>(m.Index+delta, replacement.Length));
    delta += replacement.Length - m.Length;
    return replacement;
}

It appeared to work at first, but now a bigger problem has arisen. Substitutions using the $ character in the replacement (such as $1) no longer work: the string returned by the MatchEvaluator is inserted verbatim, so $ is just treated as a literal. So we're back to square one.

Dan W asked Dec 04 '25 10:12

1 Answer

Regex.Replace has an overload that takes a MatchEvaluator. The evaluator receives a Match object, which has Index (start position) and Length properties (Value.Length gives the same match length), so you can calculate the end position.
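To illustrate the overload, here is a minimal sketch that records the index and length of each match while replacing. The class and method names (MatchSpanDemo, CollectInputSpans) are made up for this example, and the recorded positions refer to the input string, not the output.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class MatchSpanDemo
{
    // Hypothetical helper: runs Regex.Replace with a MatchEvaluator and
    // records (index, length) of every match as it appears in the INPUT.
    public static List<Tuple<int, int>> CollectInputSpans(
        string input, string pattern, string replacement, out string output)
    {
        var spans = new List<Tuple<int, int>>();
        output = Regex.Replace(input, pattern, m =>
        {
            spans.Add(Tuple.Create(m.Index, m.Length));
            // The returned string is inserted verbatim; "$1" and similar
            // substitutions are NOT expanded by this overload.
            return replacement;
        });
        return spans;
    }

    public static void Main()
    {
        string output;
        var spans = CollectInputSpans("a1b22c", @"\d+", "#", out output);
        Console.WriteLine(output); // a#b#c
        foreach (var s in spans)
            Console.WriteLine(s.Item1 + "," + s.Item2); // 1,1 then 3,2
    }
}
```

Since these are input positions, they still have to be shifted by the accumulated length difference to get positions in the output.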

EDIT: With your latest sample and my comment, it will be something like this (I'm not sure about the syntax, as I don't have VS right now):

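int delta = 0;
public List<Tuple<int, int>> replacementIndexes;
Regex rex;

public void mainbody() {
    delta = 0;
    replacementIndexes = new List<Tuple<int, int>>();
    // RegexOptions go to the constructor; the instance Replace() overloads do not accept them.
    rex = new Regex(pattern, regexOptions);
    filteredOutput = rex.Replace(inputText, match => interceptReplacements(match, replacementText));
}

public string interceptReplacements(Match m, string replacement)
{
    // Re-run the regex on the matched text alone so that $ substitutions get expanded.
    // Caveat: patterns that depend on surrounding context (anchors, lookarounds)
    // may not match the isolated text again.
    string replacementResult = rex.Replace(m.ToString(), replacement);
    // Position in the output = position in the input + accumulated length difference.
    replacementIndexes.Add(new Tuple<int,int>(m.Index+delta, replacementResult.Length));
    delta += replacementResult.Length - m.Length;
    return replacementResult;
}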
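
A possible variation, offered as a sketch rather than a tested drop-in: the inner rex.Replace(m.ToString(), replacement) call re-runs the pattern on the matched text in isolation, which can fail for patterns that rely on context. For example, (?<=\$)\d+ will not match "5" once the preceding "$" is gone. Match.Result(replacement) expands $ substitutions against the existing match instead. The names below (ReplacementSpans, ReplaceAndTrack) are invented for the example, and the delta bookkeeping assumes matches are visited left to right, which may not hold with RegexOptions.RightToLeft.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ReplacementSpans
{
    // Hypothetical helper: performs the replacement and appends the
    // (index, length) of each replacement, measured in the OUTPUT string.
    public static string ReplaceAndTrack(
        Regex regex, string input, string replacement,
        List<Tuple<int, int>> outputSpans)
    {
        int delta = 0; // accumulated (replacement length - match length) so far
        return regex.Replace(input, m =>
        {
            // Expands $1, ${name}, $& etc. using this match's own groups.
            string expanded = m.Result(replacement);
            outputSpans.Add(Tuple.Create(m.Index + delta, expanded.Length));
            delta += expanded.Length - m.Length;
            return expanded;
        });
    }

    public static void Main()
    {
        var spans = new List<Tuple<int, int>>();
        var regex = new Regex(@"(\w+)@(\w+)");
        string output = ReplaceAndTrack(regex, "x a@b y cc@dd", "$2 at $1", spans);
        Console.WriteLine(output); // x b at a y dd at cc
        foreach (var s in spans)
            Console.WriteLine(output.Substring(s.Item1, s.Item2)); // "b at a", then "dd at cc"
    }
}
```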
sarh answered Dec 07 '25 00:12