I have thousands of different regular expressions and they look like this:
^Mozilla.*Android.*AppleWebKit.*Chrome.*OPR\/([0-9\.]+)
How do I obtain the substrings matched by each .* in the regex? For the regex above, I would get four substrings, one for each of the four .*s. I also don't know in advance how many .*s there are. I could find out by doing some simple operation on the regex string, but that would add complexity to the program. I process a fairly large amount of data, so efficiency is the main concern here.
Replace the .*s with (.*)s and use matcher.group(n). For instance:
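import java.util.regex.Matcher;
import java.util.regex.Pattern;

Pattern p = Pattern.compile("1(.*)2(.*)3");
Matcher m = p.matcher("1abc2xyz3");
if (m.find()) {                      // group(n) throws IllegalStateException unless a match succeeded
    System.out.println(m.group(2));  // text captured by the second (.*)
}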
xyz
Notice how the match of the second (.*) was returned (since m.group(2) was used).
Also, since you mentioned you won't know how many .*s your regex will contain, you can call matcher.groupCount(), which returns the number of capturing groups in the pattern. That number equals the number of (.*)s only if they are the only capturing groups in your regex.
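As a rough sketch of the groupCount() idea, the loop below collects every captured group without knowing the count in advance. The class and method names (GroupCountExample, captures) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupCountExample {
    // Returns the text captured by each group, in order,
    // or an empty list if the pattern does not match the input.
    static List<String> captures(Pattern p, String input) {
        List<String> result = new ArrayList<>();
        Matcher m = p.matcher(input);
        if (m.find()) {
            // Group 0 is the whole match, so capturing groups start at 1.
            for (int i = 1; i <= m.groupCount(); i++) {
                result.add(m.group(i));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile("1(.*)2(.*)3");
        System.out.println(captures(p, "1abc2xyz3")); // prints [abc, xyz]
    }
}
```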
For your own enlightenment, try reading about capturing groups.
How do I get those substrings that match the .* in the regex? For example, for the above regex, I would get four substrings for four different .*s.
Use groups: (.*)
In addition, I don't know in advance how many .*s there are
Build your regex string, then replace .* with (.*):
String myRegex = "your regex here";
myRegex = myRegex.replace(".*","(.*)");
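Here is a small sketch of that replacement applied to the regex from the question. It assumes the regex never contains an escaped dot followed by a star (\.*) or a .* inside a character class, because String.replace works on the literal text and would rewrite those too. The sample user agent string and the names WrapDotStar and wrapDotStar are invented for the example. Note that the existing ([0-9\.]+) group ends up as group 5, after the four new groups.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WrapDotStar {
    // Wraps every literal ".*" in a capturing group.
    // Assumption: the regex has no "\.*" and no ".*" inside a character class.
    static String wrapDotStar(String regex) {
        return regex.replace(".*", "(.*)");
    }

    public static void main(String[] args) {
        String original = "^Mozilla.*Android.*AppleWebKit.*Chrome.*OPR\\/([0-9\\.]+)";
        String wrapped = wrapDotStar(original);

        // Hypothetical input, shortened for readability.
        String ua = "Mozilla/5.0 (Linux; Android 10) AppleWebKit/537.36 Chrome/90.0 OPR/63.3";

        Matcher m = Pattern.compile(wrapped).matcher(ua);
        if (m.find()) {
            // Groups 1 to 4 are the former ".*"s; group 5 is the version number.
            for (int i = 1; i <= m.groupCount(); i++) {
                System.out.println(i + ": [" + m.group(i) + "]");
            }
        }
    }
}
```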
even though I can possibly find out about that by doing some simple operation on the given regex string, but that would impose more complexity on the program
If you don't know how the regex is made and it is not built by your application, the only way is to process the regex string after you have it. If you are building the regex yourself, append (.*) to the regex string instead of appending .* in the first place.
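Since efficiency matters and there are thousands of regexes, one possible arrangement is to process and compile each regex once up front and reuse the compiled Pattern objects for every input. This is only a sketch under that assumption; PrecompiledPatterns, compileAll and firstMatch are hypothetical names, and the two short regexes stand in for the real ones.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PrecompiledPatterns {
    // Wraps ".*" in capturing groups and compiles each regex exactly once.
    static List<Pattern> compileAll(List<String> regexes) {
        List<Pattern> compiled = new ArrayList<>();
        for (String regex : regexes) {
            compiled.add(Pattern.compile(regex.replace(".*", "(.*)")));
        }
        return compiled;
    }

    // Returns the captured groups of the first pattern that matches,
    // or an empty list if none of the patterns match.
    static List<String> firstMatch(List<Pattern> patterns, String input) {
        for (Pattern p : patterns) {
            Matcher m = p.matcher(input);
            if (m.find()) {
                List<String> groups = new ArrayList<>();
                for (int i = 1; i <= m.groupCount(); i++) {
                    groups.add(m.group(i));
                }
                return groups;
            }
        }
        return new ArrayList<>();
    }

    public static void main(String[] args) {
        // Stand-in regexes; the real list would be loaded once at startup.
        List<Pattern> patterns = compileAll(Arrays.asList("^a.*b.*c", "^x.*y"));
        System.out.println(firstMatch(patterns, "a1b2c")); // prints [1, 2]
        System.out.println(firstMatch(patterns, "x123y")); // prints [123]
    }
}
```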