I have a UTF8 table in MySQL containing names, with all types of text (numeric, capitals, greek, cycrillic etc).
---------------
ID   Name
---------------
001  Jane Smith
002  John Doe
003  Джемз Смис
004  Пэтра Смис
005  "Groove" Holme
006  99er Dude
How can I select only the cyrillic names? (records 003 and 004)
EDIT
Thanks for the answer below, which looked like it would be correct, but didn't work. More research turned up this in the documentation:
Warning
The REGEXP and RLIKE operators work in byte-wise fashion, so they are not multi-byte safe and may produce unexpected results with multi-byte character sets. In addition, these operators compare characters by their byte values and accented characters may not compare as equal even if a given collation treats them as equal.
EDIT EDIT, A SOLUTION
I solved this by adding an extra field to my database which stores the script type, eg Cyrillic, Thai etc. Then a ran a batch process in PHP that detects the script and stores the information in the database.
To detect the script in PHP, use Unicode regex functions. See this page:
http://www.regular-expressions.info/unicode.html
I don't know if it covers all characters, but you should be able to use this:
SELECT * FROM table WHERE Name REGEXP '[Α-Ωα-ωА-Яа-я]'
which will return all entries containing at least one cyrillic or greek character
EDIT
Check "Warning" in OP's question about REGEXP/RLIKE working in byte-wise fashion so they might not work for multi-byte character sets.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With