I have a string in Arabic, like:
على احمد يوسف
Now I need to cut this string and output it like:
...على احمد يو
I tried this function:
function short_name($str, $limit) {
    if ($limit < 3) {
        $limit = 3;
    }
    if (strlen($str) > $limit) {
        if (preg_match('/\p{Arabic}/u', $str)) {
            return substr($str, 0, $limit - 3) . '...';
        }
        else {
            return '...'.substr($str, 0, $limit - 3);
        }
    }
    else {
        return $str;
    }
}
The problem is that sometimes it displays a symbol like this at the end of the string:
...�على احمد يو
Why does this happen?
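The symbol displayed after the cut is the result of substr() cutting in the middle of a character. substr() counts bytes, and each Arabic letter takes two bytes in UTF-8, so a cut can leave half a character behind, which is rendered as an invalid character (�).
You need to use the Multibyte String Functions to handle Arabic strings, such as mb_strlen() and mb_substr().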
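As a small illustration of the difference (a sketch only, assuming the script file is saved as UTF-8 and the mbstring extension is loaded; the variable names are made up for this example), cutting the same string by bytes and by characters gives:

```php
<?php
$str = 'على احمد يوسف';

// Byte-based cut: each Arabic letter is 2 bytes in UTF-8,
// so stopping after 5 bytes ends in the middle of the third letter.
$byteCut = substr($str, 0, 5);
var_dump(mb_check_encoding($byteCut, 'UTF-8')); // bool(false)

// Character-based cut: 5 whole characters, no broken byte sequence.
$charCut = mb_substr($str, 0, 5, 'UTF-8');
var_dump(mb_check_encoding($charCut, 'UTF-8')); // bool(true)
```

Passing 'UTF-8' explicitly as the last argument keeps the result independent of whatever the internal encoding happens to be.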
You also need to make sure the internal encoding for those functions is set to UTF-8. You can set this globally at the top of your script:
mb_internal_encoding('UTF-8');
Which leads to this:
strlen('على احمد يوسف') returns 24, the size in octets.
mb_strlen('على احمد يوسف') returns 13, the size in characters.
Note that mb_strlen('على احمد يوسف') would also return 24 if the internal encoding were still set to ISO-8859-1, which was the default in older PHP versions (since PHP 5.6 the default is UTF-8).
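Putting this together, one possible rewrite of the function from the question might look like the sketch below. It keeps the original logic (including where the '...' is placed) and only swaps the byte-based calls for their multibyte counterparts; note that $limit is now counted in characters rather than bytes, which is an assumption about what was intended.

```php
<?php
// Sketch: the question's short_name() with strlen()/substr() replaced
// by mb_strlen()/mb_substr(). $limit is measured in characters here.
function short_name(string $str, int $limit): string {
    if ($limit < 3) {
        $limit = 3;
    }
    // Count characters, not bytes; the encoding is passed explicitly.
    if (mb_strlen($str, 'UTF-8') <= $limit) {
        return $str;
    }
    // Cut on character boundaries, leaving room for the 3-character ellipsis.
    $cut = mb_substr($str, 0, $limit - 3, 'UTF-8');
    if (preg_match('/\p{Arabic}/u', $str)) {
        // Same placement as the original function for Arabic text.
        return $cut . '...';
    }
    return '...' . $cut;
}

echo short_name('على احمد يوسف', 10), "\n";
```

With a limit of 10, the result is 7 characters of the name plus the 3-character ellipsis, and it no longer ends in a broken byte sequence.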