I'm performing regex matching in .NET against strings that look like this:
1;#Lists/General Discussion/Waffles Win
2;#Lists/General Discussion/Waffles Win/2_.000
3;#Lists/General Discussion/Waffles Win/3_.000
I need to match the URL portion without the numbers at the end, so that I get this:
Lists/General Discussion/Waffles Win
This is the regex I'm trying:
(?:\d+;#)(?<url>.+)(?:/\d+_.\d+)*
The problem is that the last group is being included as part of the middle group's match. I've also tried without the * at the end but then only the first string above matches and not the rest.
I have the multi-line option enabled. Any ideas?
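In your pattern the greedy .+ consumes the rest of the line, so the final group, which * allows to match zero times, matches nothing. A few different alternatives: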
@"^\d+;#([^/]+(?:/[^/]+)*?)(?:/\d+_\.\d+)?$"
This matches as few path segments as possible, followed by an optional last part, and the end of the line.
@"^\d+;#([^/]+(?:/(?!\d+_\.\d+$)[^/]+)*)"
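This matches as many path segments as possible, as long as a segment is not the digit part at the end of the line. Note that [^/] also matches line breaks, so when the input contains several lines, either apply this pattern to one line at a time or use [^/\r\n] in place of [^/].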
@"^\d+;#(.*?)(?:/\d+_\.\d+)?$"
This matches as few characters as possible, followed by an optional last part, and the end of the line.
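As a quick check, here is a minimal sketch that runs the three patterns above against the sample lines from the question. It uses Python's re module rather than .NET, on the assumption that the syntax involved (lazy quantifiers, negative lookahead, ^ and $ in multi-line mode) behaves the same way in both; in .NET you would pass RegexOptions.Multiline instead of re.MULTILINE. The names extract_urls, lines and patterns are made up for this illustration, and each line is tested on its own.

```python
import re

# Sample entries from the question, assumed to be one entry per line.
lines = [
    "1;#Lists/General Discussion/Waffles Win",
    "2;#Lists/General Discussion/Waffles Win/2_.000",
    "3;#Lists/General Discussion/Waffles Win/3_.000",
]

# The three patterns from the answer, unchanged.
patterns = {
    "lazy segments": r"^\d+;#([^/]+(?:/[^/]+)*?)(?:/\d+_\.\d+)?$",
    "lookahead": r"^\d+;#([^/]+(?:/(?!\d+_\.\d+$)[^/]+)*)",
    "lazy characters": r"^\d+;#(.*?)(?:/\d+_\.\d+)?$",
}


def extract_urls(pattern, entries):
    """Return capture group 1 for each entry, or None where nothing matches."""
    regex = re.compile(pattern, re.MULTILINE)
    results = []
    for entry in entries:
        match = regex.search(entry)
        results.append(match.group(1) if match else None)
    return results


for name, pattern in patterns.items():
    print(name, extract_urls(pattern, lines))
```

Each pattern should yield Lists/General Discussion/Waffles Win for all three entries, with the URL in capture group 1.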