
Regex Replace repeated captures

I'm creating a log4net appender that generates NHibernate SQL scripts ready for execution.

I want to use Regex to turn log4net's output into a ready-to-use script.
A sample input would be

command 5:UPDATE [PlanParameter] SET Mode = @p0, DefaultValueString = @p1, ParameterID = @p2 WHERE ID = @p3;@p0 = 1 [Type: Int16 (0)], @p1 = '0' [Type: String (4000)], @p2 = 2 [Type: Int32 (0)], @p3 = 1362 [Type: Int32 (0)]

Which I want to replace with

UPDATE [PlanParameter] SET Mode = 1, DefaultValueString = '0', ParameterID = 2 WHERE ID = 1362

I have created the following Regex:

command \d+:(?<Query>(?:(?<PreText>[\w\s\[\]]+ = )(@p\d+)(?<PostText>,?))+);(?<Parameters>(?:@p\d+ = ('?\w+'?) \[Type: \w+ \(\d+\)\],? ?)+)

which matches and captures my samples perfectly:

[Screenshot: Expresso matches output]

I wanted the entire replacement to be handled by the Regex engine. I thought I could use a replacement string such as this:

${PreText}$2${PostText}

but that uses only the last capture of each group, which is not what I need.
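As a minimal sketch of why this happens: in .NET, a group inside a quantifier records every iteration in its Captures collection, but its Value (which is what a replacement token such as ${PreText} expands to) is only the last one. The class name, the Item group, and the simplified pattern below are illustrative assumptions, not part of the original appender.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RepeatedCaptureDemo
{
    // Returns the group's Value first, followed by every entry in Group.Captures.
    // The pattern is a simplified, hypothetical stand-in for the one in the question.
    public static string[] Describe(string input)
    {
        Match m = Regex.Match(input, @"^(?:(?<Item>\w+),?)+$");
        Group g = m.Groups["Item"];

        var parts = new string[g.Captures.Count + 1];
        parts[0] = g.Value; // same text a ${Item} replacement token would use
        for (int i = 0; i < g.Captures.Count; i++)
        {
            parts[i + 1] = g.Captures[i].Value; // one entry per loop iteration
        }
        return parts;
    }

    public static void Main()
    {
        // prints "c | a | b | c": Value is the last capture, Captures holds all three
        Console.WriteLine(string.Join(" | ", Describe("a,b,c")));
    }
}
```

The replacement-string syntax has no way to iterate over Captures, which is why some code (or a differently shaped pattern) is needed.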

In the meantime I've used C# to make it happen:

    // Requires: using System.Text.RegularExpressions;
    Regex reg = new Regex(@"command \d+:(?<Query>(?:(?<PreText>[\w\s\[\]]+ = )(@p\d+)(?<PostText>,?))+);(?<Parameters>(?:@p\d+ = ('?\w+'?) \[Type: \w+ \(\d+\)\],? ?)+)", RegexOptions.Compiled);
    string sample = @"command 5:UPDATE [PlanParameter] SET Mode = @p0, DefaultValueString = @p1, ParameterID = @p2 WHERE ID = @p3;@p0 = 1 [Type: Int16 (0)], @p1 = '0' [Type: String (4000)], @p2 = 2 [Type: Int32 (0)], @p3 = 1362 [Type: Int32 (0)]";
    Match match = reg.Match(sample);
    string result = match.Groups["Query"].Value;
    // .NET numbers unnamed groups before named ones, so Groups[1] is (@p\d+)
    // and Groups[2] is ('?\w+'?). This assumes the parameters are listed in the
    // same order in which they appear in the query.
    for (int i = 0; i < match.Groups[1].Captures.Count; i++)
    {
        Capture capture = match.Groups[1].Captures[i];
        // Replace each parameter name with the value captured at the same index.
        result = result.Replace(capture.Value, match.Groups[2].Captures[i].Value);
    }

This works perfectly, but I'm sure there is a cleaner way of doing this, perhaps with a different regular expression?

Any help would be appreciated.

Jony Adamit asked Feb 02 '26 13:02


1 Answer

Here is a more compact regex approach:

Search: = (@p\d+)(?=.*?\1 (= [^\[]+))|;(?!.*= @p\d).*

Replace: ${2}

This replaces each parameter with its value and erases everything from the semicolon to the end of the string. The leading command 5: prefix is left in place.

See the Substitution pane at the bottom of the regex demo.

Output:

command 5:UPDATE [PlanParameter] SET Mode = 1 , DefaultValueString = '0' , ParameterID = 2 WHERE ID = 1362 

Sample C#

String replaced = Regex.Replace(yourString, @"= (@p\d+)(?=.*?\1 (= [^\[]+))|;(?!.*= @p\d).*", "${2}");

Explanation

  • The parentheses in (@p\d+) capture @p and its digits to Group 1
  • The lookahead (?=.*?\1 (= [^\[]+)) asserts that what follows is...
  • .*? any chars, as few as possible, up to...
  • \1 what was matched by Group 1 (e.g. @p0), followed by a space
  • The parentheses in (= [^\[]+) capture to Group 2 the literal =, a space, and all chars that are not a [ (which we're using as a delimiter to know where your value ends). This is your value; it keeps the trailing space that precedes the [
  • OR... | we'll also match the end of the string, and since Group 2 is not set when this branch matches, the replacement ${2} will nix it
  • ; matches the semi-colon
  • For safety, the negative lookahead (?!.*= @p\d) asserts that what follows is not any chars then = @p + digit
  • .* matches all chars to the end of the string
  • The replacement string ${2} inserts Group 2, i.e. the = and the value
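The steps above can be sketched as a small runnable program. The search pattern is the one from this answer; the LogToSql class, the Convert method, the TrimEnd call on Group 2, and the second Replace that strips the command N: prefix are illustrative assumptions added to reach the exact output the question asks for.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LogToSql
{
    // Search pattern taken verbatim from the answer above.
    private const string Pattern = @"= (@p\d+)(?=.*?\1 (= [^\[]+))|;(?!.*= @p\d).*";

    // Hypothetical helper: substitutes parameter values into one logged command.
    public static string Convert(string logLine)
    {
        // A MatchEvaluator stands in for "${2}" so the trailing space that
        // Group 2 captures before the '[' can be trimmed (an added assumption).
        string sql = Regex.Replace(
            logLine,
            Pattern,
            m => m.Groups[2].Success ? m.Groups[2].Value.TrimEnd() : string.Empty);

        // Assumption: the "command N:" prefix is not wanted in the final script.
        return Regex.Replace(sql, @"^command \d+:", string.Empty);
    }

    public static void Main()
    {
        string sample = @"command 5:UPDATE [PlanParameter] SET Mode = @p0, DefaultValueString = @p1, ParameterID = @p2 WHERE ID = @p3;@p0 = 1 [Type: Int16 (0)], @p1 = '0' [Type: String (4000)], @p2 = 2 [Type: Int32 (0)], @p3 = 1362 [Type: Int32 (0)]";
        Console.WriteLine(Convert(sample));
    }
}
```

For the sample line this prints UPDATE [PlanParameter] SET Mode = 1, DefaultValueString = '0', ParameterID = 2 WHERE ID = 1362. Because the pattern treats [ as the end of a value, a string parameter that itself contains [ would be cut short, so treat this as a starting point rather than a complete converter.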

Reference

  • Lookahead and Lookbehind Zero-Length Assertions
  • Mastering Lookahead and Lookbehind
  • Everything about Regex Capture Groups
zx81 answered Feb 05 '26 03:02