import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

String original = "This is a sentence.Rajesh want to test the application for the word split.";
List<String> matchList = new ArrayList<>();
Pattern regex = Pattern.compile(".{1,10}(?:\\s|$)", Pattern.DOTALL);
Matcher regexMatcher = regex.matcher(original);
while (regexMatcher.find()) {
    matchList.add(regexMatcher.group());
}
System.out.println("Match List " + matchList);
I need to parse text into an array of lines that do not exceed 10 characters in length, without breaking a word at the end of a line.
I used the logic above, but the problem is that when a word would be broken at the end of a line, the match jumps ahead to the nearest chunk of up to 10 characters that ends in whitespace, so the start of that word is dropped.
For example, the input sentence is "This is a sentence.Rajesh want to test the application for the word split.", but after the logic runs I get the output below.
Match List [This is a , nce.Rajesh , want to , test the , pplication , for the , word , split.]
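To see where the missing characters go, it can help to record what lies between consecutive matches. The sketch below is only an illustration: the class SkippedTextDemo and the helper skippedText are hypothetical names, not part of the code in the question. It relies on the fact that Matcher.find() moves on to the next position where the pattern can match, so any text it steps over never appears in a group().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SkippedTextDemo {

    // Hypothetical helper: returns the non-empty pieces of the input that lie
    // between consecutive matches, i.e. the text that find() stepped over.
    static List<String> skippedText(String input, Pattern pattern) {
        List<String> skipped = new ArrayList<>();
        Matcher matcher = pattern.matcher(input);
        int previousEnd = 0;
        while (matcher.find()) {
            if (matcher.start() > previousEnd) {
                skipped.add(input.substring(previousEnd, matcher.start()));
            }
            previousEnd = matcher.end();
        }
        // Anything after the last match was skipped as well.
        if (previousEnd < input.length()) {
            skipped.add(input.substring(previousEnd));
        }
        return skipped;
    }

    public static void main(String[] args) {
        String original = "This is a sentence.Rajesh want to test the application for the word split.";
        Pattern regex = Pattern.compile(".{1,10}(?:\\s|$)", Pattern.DOTALL);
        // prints: Skipped [sente, a]
        System.out.println("Skipped " + skippedText(original, regex));
    }
}
```

The two skipped pieces are exactly the characters missing in front of "nce.Rajesh" and "pplication": neither "sentence.Rajesh" nor "application" contains whitespace within 10 characters of its start, so the pattern cannot match there and the search restarts further along the word.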
OK, so I've managed to get the following working with a maximum line length of 10, and it also splits words that are longer than 10 characters:
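import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

String original = "This is a sentence. Rajesh want to test the applications for the word split handling.";
List<String> matchList = new ArrayList<>();
// First alternative: up to 10 characters ending at whitespace or end of input.
// Second alternative: fallback that cuts a chunk of up to 10 characters.
Pattern regex = Pattern.compile("(.{1,10}(?:\\s|$))|(.{0,10})", Pattern.DOTALL);
Matcher regexMatcher = regex.matcher(original);
while (regexMatcher.find()) {
    matchList.add(regexMatcher.group());
}
System.out.println("Match List " + matchList);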
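This is the result, one match per line with trailing whitespace trimmed (find() also returns one final empty match at the end of the input):
This is a
sentence.
Rajesh
want to
test the
applicatio
ns for the
word split
handling.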
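This question was tagged as Groovy at some point. Assuming a Groovy answer is still valid and you are not worried about preserving runs of multiple whitespace characters, the following works; note that a word longer than maxLineSize is not split and ends up on a line of its own: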
def splitIntoLines(text, maxLineSize) {
    def words = text.split(/\s+/)
    def lines = ['']
    words.each { word ->
        def lastLine = (lines[-1] + ' ' + word).trim()
        if (lastLine.size() <= maxLineSize)
            // Change last line.
            lines[-1] = lastLine
        else
            // Add word as new line.
            lines << word
    }
    lines
}
// Tests...
def original = "This is a sentence. Rajesh want to test the application for the word split."
assert splitIntoLines(original, 10) == [
"This is a",
"sentence.",
"Rajesh",
"want to",
"test the",
"application",
"for the",
"word",
"split."
]
assert splitIntoLines(original, 20) == [
"This is a sentence.",
"Rajesh want to test",
"the application for",
"the word split."
]
assert splitIntoLines(original, original.size()) == [original]
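For anyone who needs this in Java, the same word-by-word idea can be sketched as below. This is an illustration rather than a port of either answer: the class LineWrapper and the method wrap are hypothetical names, and, unlike the Groovy version, the sketch assumes you want a word longer than the limit to be cut into chunks so that no line ever exceeds maxLineSize.

```java
import java.util.ArrayList;
import java.util.List;

class LineWrapper {

    // Hypothetical helper: greedy word wrap. Runs of whitespace are collapsed,
    // and a word longer than maxLineSize is hard-split into chunks.
    static List<String> wrap(String text, int maxLineSize) {
        if (maxLineSize < 1) {
            throw new IllegalArgumentException("maxLineSize must be at least 1");
        }
        List<String> lines = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String word : text.trim().split("\\s+")) {
            // A word that cannot fit on any line by itself is cut into chunks.
            while (word.length() > maxLineSize) {
                if (current.length() > 0) {
                    lines.add(current.toString());
                    current.setLength(0);
                }
                lines.add(word.substring(0, maxLineSize));
                word = word.substring(maxLineSize);
            }
            if (current.length() == 0) {
                current.append(word);
            } else if (current.length() + 1 + word.length() <= maxLineSize) {
                // The word fits on the current line after a single space.
                current.append(' ').append(word);
            } else {
                // Start a new line with this word.
                lines.add(current.toString());
                current.setLength(0);
                current.append(word);
            }
        }
        if (current.length() > 0) {
            lines.add(current.toString());
        }
        return lines;
    }

    public static void main(String[] args) {
        String original = "This is a sentence. Rajesh want to test the application for the word split.";
        // prints: [This is a, sentence., Rajesh, want to, test the, applicatio, n for the, word, split.]
        System.out.println(wrap(original, 10));
    }
}
```

With a limit of 20 this gives the same four lines as the Groovy test above. With a limit of 10 the only difference is that "application" becomes "applicatio" followed by "n for the" instead of one 11-character line. If you would rather keep long words whole, drop the inner while loop.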