
Maxlength of HTML input with UTF8 supplementary characters

I would like to let my users enter emoji characters in an input field. I assumed that in 2019 this would be as trivial as setting the website's meta charset to UTF-8. However, when tested in Chrome or Firefox, the example below counts supplementary characters (4 bytes long in UTF-8) differently from other characters.
In the first input I can only enter 2 more characters after the poop emoji. In the second input I can still enter 3 more characters after ‰, which is 3 bytes long in UTF-8.

What is causing this inconsistent behaviour? Is there another HTML meta setting for 4 byte characters? It worked fine in Edge 17. Even trash IE 11 counts the length correctly.

<input type="text" value="💩" maxlength="4" />
<input type="text" value="‰" maxlength="4" />

My test cases: http://jsfiddle.net/L726ryea/7/

asked Oct 19 '25 03:10 by Dharman


1 Answer

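The HTML5 spec says that maxlength applies to the JavaScript string length, which is the number of UTF-16 code units, not the number of characters or UTF-8 bytes. Code points above U+FFFF, such as most emoji, are encoded as a surrogate pair and therefore count as two code units. This explains the behavior you're seeing: 💩 uses 2 of the 4 allowed units, while ‰ uses only 1.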
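To make the three ways of counting concrete, here is a minimal sketch in plain JavaScript. The helper names (`codeUnitLength`, `codePointLength`, `utf8ByteLength`, `remainingCodeUnits`) are made up for illustration and are not part of any browser API. It assumes an environment where `TextEncoder` is available globally, such as a current browser or a recent Node.js.

```javascript
// Length as maxlength counts it: UTF-16 code units,
// which is exactly what String.prototype.length returns.
function codeUnitLength(text) {
  return text.length;
}

// Length in Unicode code points: the string iterator yields
// a surrogate pair as a single item.
function codePointLength(text) {
  return [...text].length;
}

// Length in UTF-8 bytes, as the question counted it.
function utf8ByteLength(text) {
  return new TextEncoder().encode(text).length;
}

// Hypothetical helper: how many code units are still free
// under a given maxlength value.
function remainingCodeUnits(text, maxLength) {
  return Math.max(0, maxLength - codeUnitLength(text));
}

for (const sample of ["💩", "‰"]) {
  console.log(
    sample,
    "code units:", codeUnitLength(sample),
    "code points:", codePointLength(sample),
    "UTF-8 bytes:", utf8ByteLength(sample),
    "left with maxlength=4:", remainingCodeUnits(sample, 4)
  );
}
```

For 💩 this reports 2 code units, 1 code point and 4 UTF-8 bytes, leaving room for 2 more code units. For ‰ it reports 1 code unit, 1 code point and 3 bytes, leaving room for 3. That matches what the two inputs in the question allow.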

answered Oct 21 '25 18:10 by nwellnhof


