
How to modify a Cow variable that uses itself in a loop?

I am trying to remove all the parentheses in a string. Not thinking about it too hard, I just do a simple regexp replace (i.e. the problem in question is not particularly about getting rid of arbitrary levels of nested parentheses, but feel free to suggest a better way of doing that in a comment if you want).

use regex::Regex;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let input = "Text (with some (nested) parentheses)!";
    let re = Regex::new(r"\([^()]*\)")?;

    let output = re.replace_all(&input, "");
    let output = re.replace_all(&output, "");
    //    let output = re.replace_all(&output, "");
    //    let output = re.replace_all(&output, "");
    //    let output = re.replace_all(&output, "");
    //    let output = re.replace_all(&output, "");
    // ...

    assert_eq!("Text !", output);

    println!("Works!");

    Ok(())
}

Because I do not know how deeply the parentheses will be nested, I need to do the replacement in a loop rather than repeating it "just enough times". A loop, however, creates a new scope, and that is where I hit a dead end with the borrow checker.

The simplest case that shows what I am trying to do in the loop would be:

    let mut output = re.replace_all(&input, "");
    while re.is_match(&output) {
        output = re.replace_all(&output, "");
    }

However, that does not compile, because I am assigning to a variable that is still borrowed:

error[E0506]: cannot assign to `output` because it is borrowed
 --> src/main.rs:9:9
  |
9 |         output = re.replace_all(&output, "");
  |         ^^^^^^                  ------- borrow of `output` occurs here
  |         |
  |         assignment to borrowed `output` occurs here
  |         borrow later used here

What I would like to do, ideally, is to create a new variable binding with the same name, but using let output = inside the loop body only shadows the outer binding within that scope. The outer output that the loop condition checks never changes, so the loop would cycle forever.

No matter what inner or outer temporary variable I create, I cannot make it do what I want. I also tried using the fact that re.replace_all() returns a Cow and tried using .to_owned() and .to_string() in a couple of places, but that didn't help either.

Here's a link to a playground.

asked Oct 16 '25 04:10 by nert


1 Answer

Shepmaster's answer works, but it's not as efficient as it could be. A subtle property of the Cow type is that by inspecting it, we can determine whether the string was modified, and skip work if it wasn't.

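Because of how ownership works in Rust, calling Cow::into_owned() on an unmodified value (Cow::Borrowed) has to copy the borrowed data into a new String, whereas calling it on a modified value (Cow::Owned) simply hands back the String it already holds, without copying. (into_owned documentation)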
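
To make that difference concrete, here is a minimal std-only sketch that compares buffer addresses before and after into_owned(). The two helper functions are hypothetical names made up for this illustration; only Cow, String and str come from the standard library.

```rust
use std::borrow::Cow;

/// Hypothetical helper: true if `into_owned()` on a `Cow::Owned`
/// returns the very same heap buffer (no copy was made).
fn owned_reuses_buffer() -> bool {
    let s = String::from("already owned");
    let ptr = s.as_ptr();
    let cow: Cow<'_, str> = Cow::Owned(s);
    // The String is moved out of the Cow unchanged.
    cow.into_owned().as_ptr() == ptr
}

/// Hypothetical helper: true if `into_owned()` on a `Cow::Borrowed`
/// produced a fresh allocation (the borrowed data was copied).
fn borrowed_allocates() -> bool {
    let text = "still borrowed";
    let cow: Cow<'_, str> = Cow::Borrowed(text);
    let owned = cow.into_owned();
    owned.as_ptr() != text.as_ptr()
}
```

Both helpers return true, which is why it pays to avoid into_owned() on the Borrowed variant when nothing changed.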

In your use case, we can detect unmodified Cow -- Cow::Borrowed -- and skip into_owned().

    use std::borrow::Cow;

    let mut output = input.to_string(); // any mutable String works here
    // regex::Regex::is_match() returns a plain bool, so there is nothing to unwrap
    while re.is_match(&output) {
        match re.replace_all(&output, "") {
            // Unmodified -- skip copy
            Cow::Borrowed(_) => {}
            // replace_all() returned a new value that we already own
            Cow::Owned(new) => output = new,
        }
    }

But we can go further. Calling both is_match() and replace_all() means the pattern is matched twice. With our new knowledge of Cows, we can optimize that away:

    let mut output = input.to_string(); // any mutable String works here
    // Cow::Owned is returned when the string was modified.
    while let Cow::Owned(new) = re.replace_all(&output, "") {
        output = new;
    }

Edit: If your input value is immutable, you can avoid the initial .to_string() copy by making output a Cow as well:

    let input = "value";
    let mut output = Cow::from(input);
    while let Cow::Owned(new) = re.replace_all(&output, "") {
        output = Cow::Owned(new);
    }
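
For anyone who wants to try the loop without pulling in the regex crate, here is a rough std-only sketch of the same pattern. remove_innermost_parens() is a hypothetical stand-in that I wrote to mimic re.replace_all(text, "") for the pattern \([^()]*\): it returns Cow::Borrowed when nothing matched and Cow::Owned otherwise. strip_parens() is likewise just an illustrative name, not part of any library.

```rust
use std::borrow::Cow;

/// Hypothetical stand-in for `re.replace_all(text, "")` with the
/// pattern `\([^()]*\)`: removes every innermost "(...)" group once.
fn remove_innermost_parens(text: &str) -> Cow<'_, str> {
    let mut result = String::new();
    let mut copied_up_to = 0; // byte index up to which `text` has been handled
    let mut open: Option<usize> = None; // position of the most recent '('
    let mut changed = false;

    for (i, c) in text.char_indices() {
        match c {
            '(' => open = Some(i),
            ')' => {
                if let Some(start) = open.take() {
                    // text[start..=i] is "(...)" with no parentheses inside:
                    // keep what precedes it and skip the group itself.
                    result.push_str(&text[copied_up_to..start]);
                    copied_up_to = i + 1;
                    changed = true;
                }
            }
            _ => {}
        }
    }

    if !changed {
        // Nothing was removed: hand the input back without allocating.
        return Cow::Borrowed(text);
    }
    result.push_str(&text[copied_up_to..]);
    Cow::Owned(result)
}

/// The loop from this answer: repeat until a pass changes nothing.
fn strip_parens(input: &str) -> Cow<'_, str> {
    let mut output = Cow::Borrowed(input);
    // Cow::Owned means the last pass modified the string, so go again.
    while let Cow::Owned(new) = remove_innermost_parens(&output) {
        output = Cow::Owned(new);
    }
    output
}
```

With the input from the question, strip_parens("Text (with some (nested) parentheses)!") gives "Text !", and an input without any matching parentheses comes back as Cow::Borrowed, with no allocation at all.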
answered Oct 17 '25 21:10 by intgr


