
How to replace only some of the matched substrings?

Tags:

java

regex

This is a regex question for which I couldn't find an answer yet:

Input:

"the current time is <start time>00:00:00<end time>. at 00:00:00 there is a firework. Another appearance of 00:00:00."

Desired output:

"the current time is <start time>00:00:00<end time>. at <start time>00:00:00<end time> there is a firework. Another appearance of <start time>00:00:00<end time>."

The solution must not involve first splitting the string by sentence.

What I tried:

A simple input.replace(group, replace) won't work because one occurrence is already wrapped and must not be replaced again.

    public static void main(String[] args) throws ParseException
    {
       String input = "the current time is <start time>00:00:00<end time>. at 00:00:00 there is a firework. Another appearance of 00:00:00.";
       Pattern p  = Pattern.compile("(<start time>)?(00:00:00)(<end time>)?");
       Matcher m  = p.matcher(input);
       while(m.find())
       {
            if(m.group(1) != null) { continue; }
            String substr1 = input.substring(0, m.start(2));
            String substr2 = input.substring(m.end(2), input.length());
            String repl = "<start time>" + m.group(2) + "<end time>";
            input = substr1 + repl + substr2;
       }
   }
tenticon asked Jan 21 '26 15:01


1 Answer

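Your code isn't working because you modify input inside the loop while the Matcher keeps scanning the original string. After the first replacement, the indexes returned by m.start(2) and m.end(2) no longer line up with the modified input.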
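If you did want to keep a loop, one way to avoid the stale-index problem is to let the Matcher build the output itself with appendReplacement and appendTail, so you never index into a string that has changed. This is only a minimal sketch reusing your pattern; the class and method names (LoopReplaceSketch, wrapBareTimes) are placeholders I made up for illustration:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LoopReplaceSketch {
    // Hypothetical helper: wraps bare 00:00:00 occurrences and copies
    // occurrences that already start with "<start time>" unchanged.
    static String wrapBareTimes(String input) {
        Pattern p = Pattern.compile("(<start time>)?(00:00:00)(<end time>)?");
        Matcher m = p.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String replacement = (m.group(1) != null)
                    ? m.group()  // already wrapped: keep the whole match as-is
                    : "<start time>" + m.group(2) + "<end time>";
            // quoteReplacement guards against '$' or '\' being treated specially
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);  // copy whatever follows the last match
        return out.toString();
    }

    public static void main(String[] args) {
        String input = "the current time is <start time>00:00:00<end time>. "
                + "at 00:00:00 there is a firework. Another appearance of 00:00:00.";
        System.out.println(wrapBareTimes(input));
    }
}
```

The input string is never reassigned here, so every index the Matcher uses refers to the string it is actually scanning.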

But the good news is that you don't need the loop at all. You can combine a negative lookbehind and a negative lookahead to automatically skip the instances that already have the wrapper, and let replaceAll do the looping for you:

public static void main(String[] args) throws Exception
{
   String input = "the current time is <start time>00:00:00<end time>. at 00:00:00 there is a firework. Another appearance of 00:00:00.";
   String result = input.replaceAll("(?<!<start time>)00:00:00(?!<end time>)", "<start time>00:00:00<end time>"); 
   // Negative lookbehind -----------^^^^^^^^^^^^^^^^^        ^^^^^^^^^^^^^^
   // Negative lookahead ------------------------------------/
   System.out.println(result);
}


The negative lookbehind says "don't match if the text has this in front of it" and the negative lookahead says "don't match if the text has this after it."
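To see what each lookaround contributes, here is a small sketch that runs the same pattern against a few short inputs. The class and method names (LookaroundSketch, wrap) and the test strings are my own, chosen only for illustration. Note that the match is rejected if either assertion fails, so a time that has only one half of the wrapper is left alone as well:

```java
import java.util.regex.Pattern;

class LookaroundSketch {
    // Same regex as above: match 00:00:00 only when it is NOT preceded by
    // "<start time>" and NOT followed by "<end time>".
    private static final Pattern BARE_TIME =
            Pattern.compile("(?<!<start time>)00:00:00(?!<end time>)");

    // Hypothetical helper wrapping every bare occurrence.
    static String wrap(String input) {
        return BARE_TIME.matcher(input).replaceAll("<start time>00:00:00<end time>");
    }

    public static void main(String[] args) {
        // Bare time: both assertions pass, so it is wrapped.
        System.out.println(wrap("at 00:00:00."));
        // Fully wrapped: the lookbehind rejects it, so it is unchanged.
        System.out.println(wrap("<start time>00:00:00<end time>"));
        // Only the opening tag: the lookbehind still rejects it.
        System.out.println(wrap("<start time>00:00:00 x"));
        // Only the closing tag: the lookahead rejects it.
        System.out.println(wrap("00:00:00<end time>"));
    }
}
```

Lookarounds are zero-width, so the tags they check are never part of the match and are not consumed or replaced.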

T.J. Crowder answered Jan 23 '26 05:01


