I have an unusual problem with multibyte strings: I need to shuffle, with a fair degree of randomness, a long UTF-8 encoded multibyte string in PHP without dropping or repeating any of the characters.
In the PHP manual under str_shuffle there is a multibyte function (the first user-submitted one) that doesn't work. If I give it a string containing, for example, all the Japanese hiragana and katakana, with a length of 120 characters, I get back a string of 119 or 118 characters. Sometimes I've seen duplicate characters even though the original string has none. So that function isn't usable.
To make this more complex, I also need to include if possible Japanese UTF-8 newlines and line feeds and punctuation.
Can anyone with experience dealing in multiple languages with UTF-8 mb strings help? Does PHP have any built in functions to do this? str_shuffle is EXACTLY what I want. I just need it to also work on multibyte chars.
Thanks very much!
Try splitting the string using mb_strlen and mb_substr to create an array, then using shuffle before joining it back together again. (Edit: As also demonstrated in @Frosty Z's answer.)
An example from the PHP interactive prompt:
php > $string = "Pretend I'm multibyte!";
php > $len = mb_strlen($string);
php > $sploded = array();
php > while($len-- > 0) { $sploded[] = mb_substr($string, $len, 1); }
php > shuffle($sploded);
php > echo join('', $sploded);
rmedt tmu nIb'lyi!eteP
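You'll want to be sure to specify the encoding where appropriate: mb_strlen and mb_substr both accept an optional encoding argument (for example 'UTF-8') and fall back to the internal encoding when it is omitted.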
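For reference, here is a minimal sketch of the same approach wrapped in a function, with the encoding passed explicitly. The function name mb_str_shuffle_sketch and the sample string are made up for illustration; PHP has no built-in by that name. It only uses mb_strlen, mb_substr, shuffle and implode.

```php
<?php
// Hypothetical helper (not a PHP built-in): shuffles a multibyte string
// character by character, assuming the input is valid in $encoding.
function mb_str_shuffle_sketch(string $string, string $encoding = 'UTF-8'): string
{
    $chars = [];
    $len = mb_strlen($string, $encoding);

    // Split into an array of single characters (not bytes).
    for ($i = 0; $i < $len; $i++) {
        $chars[] = mb_substr($string, $i, 1, $encoding);
    }

    // shuffle() reorders the array in place; no element is added or removed.
    shuffle($chars);

    return implode('', $chars);
}

// Sample input with kana, Japanese punctuation and a newline (illustrative only).
$original = "あいうえお、カキクケコ。\n";
$shuffled = mb_str_shuffle_sketch($original);

// The character count should be unchanged after shuffling.
var_dump(mb_strlen($shuffled, 'UTF-8') === mb_strlen($original, 'UTF-8'));
```

Because the string is split on character boundaries rather than bytes, punctuation and newlines are treated like any other character. On PHP 7.4 or later, mb_str_split($string, 1, $encoding) could replace the loop.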