
JavaScript case-insensitive regex issue with Turkish characters

I'm using a regex to filter some content.

var word = new RegExp(filterWord, "gi"); // "g" = global, "i" = case-insensitive
content = content.replace(word, "");     // removes every match of filterWord from content

This code works properly, but when the content contains an uppercase "İ", the word is not replaced.

For example, if

filterWord = "istanbul";

and

content = "İstanbul";

the code above does not remove the word. If I change istanbul to İstanbul it works, but then the match is no longer case-insensitive. How can I solve this problem?

Erdi asked Jun 09 '26 10:06


2 Answers

You can list both the lowercase and the uppercase form in a character class:

/[İi]stanbul/i
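
The character-class idea can be applied to an arbitrary filter word instead of a hard-coded pattern. The following is only a sketch: the helper name turkishInsensitiveRegExp is hypothetical (it is not from the original posts), and it assumes that all four i-variants (i, İ, I, ı) should be treated as equivalent, which is looser than strict Turkish spelling.

```javascript
// Hypothetical helper: builds a global, case-insensitive RegExp from a plain
// word, expanding every i-variant into an explicit character class.
// \u0130 is İ (capital I with dot above), \u0131 is ı (small dotless i).
function turkishInsensitiveRegExp(filterWord) {
  // Escape regex metacharacters so the word is matched literally.
  var escaped = filterWord.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  // Assumption: i, İ, I and ı are all interchangeable for filtering purposes.
  var pattern = escaped.replace(/[i\u0130I\u0131]/g, "[i\u0130I\u0131]");
  return new RegExp(pattern, "gi");
}

var word = turkishInsensitiveRegExp("istanbul");
var samples = ["\u0130stanbul", "ISTANBUL", "\u0131stanbul", "istanbul"];
var cleaned = samples.map(function (content) {
  return content.replace(word, "");
});
console.log(cleaned); // every entry is the empty string
```

If the dotted and dotless letters must stay distinct, the single class could be split into two (one for i/İ, one for ı/I); that choice depends on how strict the filter needs to be.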


engtuncay answered Jun 12 '26 00:06


How a regex treats lowercase and uppercase characters depends on the characters' Unicode code points (hex codes) and on how the Unicode standard pairs them as upper and lower case. I expect this to hold for any language, since Unicode is an international standard.

For example, for English:

[Image: English characters with their hex codes]

Similarly, for Turkish:

[Image: Turkish characters with their hex codes]

Above, characters highlighted in the same color are the uppercase and lowercase forms of the same letter, and their hex codes differ in only one digit. For Ê the hex code is 00CA and for ê it is 00EA; the only difference is C versus E in the third position.

Similarly, for Ý and ý the hex codes are 00DD and 00FD, differing only in D versus F.
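
The hex codes quoted above can be checked directly in JavaScript. This is a small sketch; the helper names hex and codePointGap are made up for illustration. It also prints the gap for İ and i, which suggests that the fixed offset of 0x20 (32) is a property of how these particular Latin letters are laid out, not a general rule for every upper/lower pair.

```javascript
// Hypothetical helpers for inspecting code points.
function hex(ch) {
  // Four-digit uppercase hex code of the first UTF-16 code unit.
  return ch.charCodeAt(0).toString(16).toUpperCase().padStart(4, "0");
}
function codePointGap(upper, lower) {
  return lower.charCodeAt(0) - upper.charCodeAt(0);
}

var pairs = [
  ["\u00CA", "\u00EA"], // Ê, ê
  ["\u00DD", "\u00FD"], // Ý, ý
  ["\u0130", "i"]       // İ, i
];

pairs.forEach(function (pair) {
  console.log(hex(pair[0]), hex(pair[1]), codePointGap(pair[0], pair[1]));
});
// 00CA 00EA 32
// 00DD 00FD 32
// 0130 0069 -199
```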

Now check this example:

'ÊÌÝêìý'.match(/Ì/gi) //case insensitive
//output ["Ì", "ì"]
'ÊÌÝêìý'.match(/Ì/g) //case sensitive
//output ["Ì"]

'ÊÌÝêìý'.match(/Ý/ig) //case insensitive
//output ["Ý", "ý"]
'ÊÌÝêìý'.match(/Ý/g) //case sensitive
//output ["Ý"]

If you are using the right characters, it should work normally. I don't know much about Latin-Turkish characters, though.
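
For the Turkish İ specifically, the same kind of check can be run. The sketch below uses only built-in string and RegExp methods; the variable names are illustrative. It suggests why the question's pattern misses İ: in JavaScript's locale-independent case mappings, i and İ are not a simple upper/lower pair, so the "i" flag alone does not connect them.

```javascript
var dottedCapitalI = "\u0130"; // İ (capital I with dot above)

// Locale-independent case conversion, as used by toUpperCase/toLowerCase.
var upperOfSmallI = "i".toUpperCase();             // "I" (U+0049), not İ
var lowerOfDottedI = dottedCapitalI.toLowerCase(); // "i" followed by U+0307 (combining dot above)

// The "i" flag alone does not pair i with İ; listing İ explicitly does.
var plainMatch = /i/i.test(dottedCapitalI);         // false
var classMatch = /[i\u0130]/i.test(dottedCapitalI); // true

console.log(upperOfSmallI, lowerOfDottedI.length, plainMatch, classMatch);
// I 2 false true
```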

Harpreet Singh answered Jun 11 '26 22:06


