I'm trying to interpret a multiline string with RegEx and just found that matching fails if the string contains newline characters. I'm NOT using MULTILINE mode, because I'm not using anchors. According to the API docs:
In multiline mode the expressions ^ and $ match just after or just before, respectively, a line terminator or the end of the input sequence. By default these expressions only match at the beginning and the end of the entire input sequence.
In short: it clearly says that this flag only changes how anchors work and says nothing like "when your string contains a newline you should definitely use this".
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {
    public static void main(String[] args) {
        Pattern p = Pattern.compile(".*");
        Matcher m1 = p.matcher("Hello");
        System.out.println("m1: " + m1.matches()); // true
        Matcher m2 = p.matcher("Hello\r\n");
        System.out.println("m2: " + m2.matches()); // false
    }
}
So is this really a bug, or did I just miss something in the docs? Or does Java use a dialect of RegEx in which my pattern fails? I'm using jdk1.6.0_21.
From the Pattern docs:
The regular expression . matches any character except a line terminator unless the DOTALL flag is specified.
So you need to specify the DOTALL flag if you want m2.matches() to be true.
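As a minimal sketch of that fix, the snippet below compares the default behaviour with Pattern.DOTALL and with the equivalent embedded flag (?s). The class name DotAllDemo and the helpers matchesWhole and matchesWholeInline are hypothetical names used only for illustration; they are not part of the original question.

```java
import java.util.regex.Pattern;

class DotAllDemo {
    // Hypothetical helper: true if ".*" matches the ENTIRE input under the given flags.
    static boolean matchesWhole(String input, int flags) {
        return Pattern.compile(".*", flags).matcher(input).matches();
    }

    // Hypothetical helper: same check, but enabling DOTALL via the embedded flag (?s).
    static boolean matchesWholeInline(String input) {
        return Pattern.compile("(?s).*").matcher(input).matches();
    }

    public static void main(String[] args) {
        String input = "Hello\r\n";
        // Default: '.' stops at line terminators, so the trailing \r\n is left unmatched.
        System.out.println("default: " + matchesWhole(input, 0));              // false
        // DOTALL: '.' also matches \r and \n, so the whole input is consumed.
        System.out.println("DOTALL:  " + matchesWhole(input, Pattern.DOTALL)); // true
        // (?s) inside the pattern has the same effect as passing Pattern.DOTALL.
        System.out.println("(?s):    " + matchesWholeInline(input));           // true
    }
}
```

Note that matches() requires the whole input to match. If you only need the first line, find() with the default flags would also work, since ".*" still matches "Hello" up to the line terminator.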