
Why are groups being captured from the wrong region in my regex? (Java 7)

Tags: java, regex

Data is being captured from the wrong region: the regex captures the IP address and subnet mask from eth1 instead of eth0. I do not understand why this is happening. I also tried matcher.find(0) but got the same result.

String[] dataNames = new String[]{"eth0Ip", "eth0Subnet"};
dataExtractionPattern = Pattern.compile("eth0 .*inet (?<eth0Ip>\\S+)  mask (?<eth0Subnet>\\S+)",Pattern.DOTALL);

Matcher matcher = dataExtractionPattern.matcher(receivedDataString);
if (matcher.find()) {
    for (String key : dataNames) {
        String dataValue;
        dataValue = matcher.group(key);
        extractedData.put(key, dataValue);
    }
    hasData = true;
}

Input string is:

lo0     Link type:Local loopback  Queue:none
        inet 127.0.0.1  mask 255.255.255.255
        UP RUNNING LOOPBACK
        MTU:1500  metric:1  VR:0
        RX packets:4 mcast:0 errors:0 dropped:1
        TX packets:4 mcast:0 errors:0
        collisions:0 unsupported proto:0
        RX bytes:172  TX bytes:172

eth0    Link type:Ethernet  HWaddr 00:25:f2:5e:9c:34  Queue:none
        inet 10.1.2.2  mask 10.1.2.1  broadcast 255.255.255.254
        RUNNING BROADCAST
        MTU:1000  metric:1  VR:0
        RX packets:0 mcast:0 errors:0 dropped:0
        TX packets:0 mcast:0 errors:0
        collisions:0 unsupported proto:0
        RX bytes:0  TX bytes:0

eth1    Link type:Ethernet  HWaddr 00:25:f2:5e:9c:33  Queue:none
        inet 192.168.200.51  mask 255.255.255.0  broadcast 192.168.200.255
        UP RUNNING BROADCAST
        MTU:1500  metric:1  VR:0
        RX packets:0 mcast:0 errors:0 dropped:0
        TX packets:0 mcast:0 errors:0
        collisions:0 unsupported proto:0
        RX bytes:0  TX bytes:0

For the eth0 IP, it incorrectly captures 192.168.200.51, and for the mask, 255.255.255.0. Both values belong to eth1.

asked Dec 06 '25 16:12 by likejudo


1 Answer

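As nhahtdh mentioned, the .* part is greedy: it matches as much as it can. With Pattern.DOTALL it also matches line breaks, so it first consumes everything up to the end of the input and then backtracks only as far as needed for the rest of the pattern to match. The match therefore ends at the last inet ... mask ... pair in the input, which belongs to eth1.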
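The greedy behaviour can be reproduced on a shortened input. This is only a sketch: the class name GreedyDemo, the method greedyCapture, the group name ip and the abbreviated input string are made up for illustration and are not part of the original code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyDemo {
    // Greedy ".*" with DOTALL: consumes to the end of the input, then
    // backtracks to the LAST place where "inet " can still match.
    static String greedyCapture(String input) {
        Pattern p = Pattern.compile("eth0 .*inet (?<ip>\\S+)", Pattern.DOTALL);
        Matcher m = p.matcher(input);
        return m.find() ? m.group("ip") : null;
    }

    public static void main(String[] args) {
        // Hypothetical, abbreviated stand-in for the real interface listing.
        String input = "eth0 Link\n  inet 10.1.2.2\neth1 Link\n  inet 192.168.200.51\n";
        System.out.println(greedyCapture(input)); // prints 192.168.200.51
    }
}
```

Calling matcher.find(0) does not help here, because the match still starts at eth0; the problem is where the match ends, not where it begins.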

You can make a quantifier reluctant (also called lazy or non-greedy) by adding a ? after it:

dataExtractionPattern = Pattern.compile("eth0 .*?inet (?<eth0Ip>\\S+)  mask (?<eth0Subnet>\\S+)",Pattern.DOTALL);

This matches as little as possible, so the match stops at the first occurrence of inet (?<eth0Ip>\\S+)  mask (?<eth0Subnet>\\S+) after eth0.
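As a rough, self-contained check of the reluctant pattern, the sketch below runs it against a trimmed copy of the eth0 and eth1 blocks from the question. The class name LazyDemo and the method extract are assumptions for illustration; the pattern and group names are the ones used above.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LazyDemo {
    // Reluctant ".*?": expands one character at a time and stops at the
    // FIRST "inet ...  mask ..." pair that follows "eth0 ".
    private static final Pattern ETH0 = Pattern.compile(
            "eth0 .*?inet (?<eth0Ip>\\S+)  mask (?<eth0Subnet>\\S+)", Pattern.DOTALL);

    static Map<String, String> extract(String input) {
        Map<String, String> result = new LinkedHashMap<String, String>();
        Matcher m = ETH0.matcher(input);
        if (m.find()) {
            result.put("eth0Ip", m.group("eth0Ip"));
            result.put("eth0Subnet", m.group("eth0Subnet"));
        }
        return result;
    }

    public static void main(String[] args) {
        String input =
              "eth0    Link type:Ethernet  HWaddr 00:25:f2:5e:9c:34  Queue:none\n"
            + "        inet 10.1.2.2  mask 10.1.2.1  broadcast 255.255.255.254\n"
            + "eth1    Link type:Ethernet  HWaddr 00:25:f2:5e:9c:33  Queue:none\n"
            + "        inet 192.168.200.51  mask 255.255.255.0  broadcast 192.168.200.255\n";
        System.out.println(extract(input)); // prints {eth0Ip=10.1.2.2, eth0Subnet=10.1.2.1}
    }
}
```

One caveat: a reluctant quantifier only prefers the shortest match. If the eth0 block had no inet line at all, .*? would still expand past it and capture the values from eth1.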

answered Dec 08 '25 05:12 by stema


