I want to split a UTF-8 string.
I have tried StringTokenizer, but it does not give the result I expect.
The title should be "0", but it comes out as "عُدي_صدّام_حُسين".
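import java.util.StringTokenizer;

String test = "en.m عُدي_صدّام_حُسين 1 0";
// With no delimiter argument, StringTokenizer splits on whitespace
StringTokenizer stringTokenizer = new StringTokenizer(test);
String code = stringTokenizer.nextToken();  // first token
String title = stringTokenizer.nextToken(); // second token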
What is the correct way to split a UTF-8 string?
The problem here is that the Arabic text isn't actually "at the end" of the string; it only looks that way because of how the text is rendered.
For example, if I select the contents of the string literal (in Chrome), moving my mouse from left to right, it selects the en.m first, then all of the Arabic text, then the 0 1. Java stores the characters in logical order (the order in which they were typed), while the display follows the Unicode bidirectional algorithm, which can lay out the right-to-left Arabic run, together with the digits that follow it, in a different visual order.
The string as specified in your Java source code actually does have عُدي_صدّام_حُسين as the second token. So you're splitting it correctly; you're just not splitting what you think you're splitting.
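To see the logical order for yourself, it can help to print each token on its own line, so that bidirectional rendering cannot rearrange them relative to one another. The following is only a minimal sketch: the class name LogicalOrderDemo and the helper tokens are made up for illustration, and it assumes the string literal is stored exactly as in the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo class (not from the question): lists tokens in logical order.
public class LogicalOrderDemo {

    // Splits on whitespace with StringTokenizer and returns the tokens
    // in the order they are stored in memory, not the order they are drawn.
    static List<String> tokens(String s) {
        List<String> out = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(s);
        while (st.hasMoreTokens()) {
            out.add(st.nextToken());
        }
        return out;
    }

    public static void main(String[] args) {
        String test = "en.m عُدي_صدّام_حُسين 1 0";
        List<String> parts = tokens(test);
        // One token per line, labelled with its index.
        for (int i = 0; i < parts.size(); i++) {
            System.out.println("token[" + i + "] = " + parts.get(i));
        }
    }
}
```

Under that assumption, the Arabic text sits at index 1 and the "0" at index 3, so if the "0" is the value you are after, you would read the fourth token rather than the second.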