
php multi byte strings regex

Tags:

php

utf-8

We have a regex that strips out all non-alphanumeric characters except '#', '&' and '-'. Here is what it looks like:

preg_replace('/[^a-zA-Z0-9#&-]/', '', strtolower($title));

We now need to support Traditional Chinese strings, and the call above no longer works for them. How can I implement the same filtering for Traditional Chinese?

Thanks,

user824212 asked May 02 '26 04:05


1 Answer

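Use the u modifier, which makes PCRE treat both the pattern and the subject as UTF-8, so multibyte characters such as 诶 can be listed in the character class: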

preg_replace('/[^a-zA-Z0-9#&诶-]/u', '', $string);

By the way, don't use strtolower() on UTF-8 text: it works byte by byte and, depending on the PHP version and locale, can corrupt multibyte characters. Use mb_strtolower() instead:

mb_strtolower($string, 'UTF-8');
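
Listing every Chinese character in the class by hand is not practical, so one possible way to combine the two suggestions above is to match the whole Han script with the Unicode property \p{Han}. This is only a sketch under some assumptions: the function name clean_title() is hypothetical, the input is expected to be valid UTF-8, and the mbstring extension must be enabled. Note that \p{Han} also matches Simplified Chinese characters and Japanese kanji, which may or may not be what you want.

```php
<?php
// Hypothetical helper (not part of the original answer): keeps ASCII letters,
// digits, '#', '&', '-' and Han characters, and removes everything else.
function clean_title(string $title): string
{
    // Lowercase with a multibyte-aware function. Han characters have no
    // case, so they pass through unchanged.
    $lower = mb_strtolower($title, 'UTF-8');

    // \p{Han} matches any character of the Han script; it needs the u modifier.
    // The hyphen is placed last in the class so it is a literal, not a range.
    $clean = preg_replace('/[^a-z0-9#&\p{Han}-]/u', '', $lower);

    // preg_replace() returns null on failure, for example on invalid UTF-8.
    return $clean ?? '';
}

echo clean_title('Hello, 世界! R&D #1 - 繁體中文'), "\n";
// prints: hello世界r&d#1-繁體中文
```

Because the string is lowercased before filtering, the class only needs a-z rather than a-zA-Z.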
Karolis answered May 03 '26 16:05


