
How can I obtain what .* matched in a regular expression?

Tags:

java

regex

I have thousands of different regular expressions and they look like this:

^Mozilla.*Android.*AppleWebKit.*Chrome.*OPR\/([0-9\.]+)

How do I obtain the substrings matched by each .* in the regex? For example, for the regex above I would get four substrings, one for each of the four .*s. In addition, I don't know in advance how many .*s there are. I could find out by doing some simple operation on the regex string, but that would add complexity to the program. I process a fairly large amount of data, so efficiency really matters here.

Simo asked Jan 18 '26 10:01


2 Answers

Replace the .*s with (.*)s and use matcher.group(n). For instance:

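import java.util.regex.Matcher;
import java.util.regex.Pattern;

Pattern p = Pattern.compile("1(.*)2(.*)3");
Matcher m = p.matcher("1abc2xyz3");
if (m.find()) {  // group() throws IllegalStateException if no match was found
    System.out.println(m.group(2));  // prints "xyz"
}

m.group(2) returns what the second (.*) matched; m.group(1) would return "abc".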

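Also, since you mentioned you won't know how many .*s your regex contains, you can use matcher.groupCount(), which returns the number of capturing groups in the pattern. That number equals the number of .*s only if the (.*)s are the only capturing groups; your example already contains ([0-9\.]+), which would be counted as well.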
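As a minimal sketch of the groupCount() idea, the loop below collects every capturing group of the first match. The class name GroupLoop and the helper captures are made up for illustration; only Pattern, Matcher, find(), group(int) and groupCount() come from java.util.regex.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GroupLoop {
    // Hypothetical helper: returns the text of every capturing group for the
    // first match, or an empty list when the input does not match at all.
    static List<String> captures(Pattern p, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = p.matcher(input);
        if (m.find()) {
            // Group 0 is the whole match, so the capturing groups start at 1.
            for (int i = 1; i <= m.groupCount(); i++) {
                out.add(m.group(i));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile("1(.*)2(.*)3");
        System.out.println(captures(p, "1abc2xyz3")); // prints [abc, xyz]
    }
}
```

Compiling the Pattern once and reusing it for every input is usually the cheaper choice when there is a lot of data to process.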

For your own enlightenment, try reading about capturing groups.

arshajii answered Jan 21 '26 03:01


How do I get those substrings that match the .* in the regex? For example, for the above regex, I would get four substrings for four different DOT STAR.

Use groups: (.*)


In addition, I don't know in advance how many DOT STARs there are

Build your regex string, then replace .* with (.*):

String myRegex = "your regex here";
// String.replace does a literal (non-regex) replacement, so ".*" needs no escaping here
myRegex = myRegex.replace(".*", "(.*)");

even though I can possibly find out about that by doing some simple operation on the given regex string, but that would impose more complexity on the program

If the regex is not built by your application and you don't know how it is constructed, the only option is to post-process it once you have it. If you are building the regex yourself, append (.*) to the regex string instead of .*.
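A possible variation on this rewriting step, sketched under some assumptions: wrap each .* in a named group instead of a plain one, so the results cannot be confused with capturing groups the regex already had, such as ([0-9\.]+) in the question. The class DotStarCapture, the group names ds1, ds2, ... and the sample user-agent string are all invented for illustration. The sketch assumes the regex has no .* inside a character class or a \Q...\E quoted section, no lazy or possessive variants (.*? or .*+), and no existing groups with the same names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class DotStarCapture {
    private final Pattern pattern;   // compiled once, reused for every input
    private final int dotStarCount;  // how many .* were wrapped

    DotStarCapture(String regex) {
        StringBuilder sb = new StringBuilder();
        int count = 0;
        int i = 0;
        while (i < regex.length()) {
            char c = regex.charAt(i);
            if (c == '\\' && i + 1 < regex.length()) {
                // Copy an escaped character verbatim, so "\.*" is left alone.
                sb.append(c).append(regex.charAt(i + 1));
                i += 2;
            } else if (c == '.' && i + 1 < regex.length() && regex.charAt(i + 1) == '*') {
                // Wrap this .* in a named group: (?<ds1>.*), (?<ds2>.*), ...
                count++;
                sb.append("(?<ds").append(count).append(">.*)");
                i += 2;
            } else {
                sb.append(c);
                i++;
            }
        }
        this.pattern = Pattern.compile(sb.toString());
        this.dotStarCount = count;
    }

    // Returns what each .* matched, in order, or an empty list on no match.
    List<String> match(String input) {
        List<String> out = new ArrayList<>();
        Matcher m = pattern.matcher(input);
        if (m.find()) {
            for (int k = 1; k <= dotStarCount; k++) {
                out.add(m.group("ds" + k));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        DotStarCapture d = new DotStarCapture(
                "^Mozilla.*Android.*AppleWebKit.*Chrome.*OPR\\/([0-9\\.]+)");
        // Made-up input, only to exercise the pattern.
        List<String> parts = d.match(
                "Mozilla/5.0 (Android 10) AppleWebKit/537.36 Chrome/90.0 OPR/63.3");
        System.out.println(parts.size()); // prints 4
    }
}
```

With thousands of regexes, one DotStarCapture per regex can be built up front, so the rewriting and compilation cost is paid once rather than per input.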

BackSlash answered Jan 21 '26 01:01


