
How do I use String methods on UTF-8 characters?

Tags: ruby

How do I use String methods on UTF-8 characters?

For example, I have a string with Cyrillic characters, and calling string.upcase on it leaves those characters unchanged.

adaxa asked Nov 22 '25 09:11


1 Answer

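Ruby's built-in methods only support case conversions on the ASCII letters A-Z and a-z. (This describes Ruby before version 2.4; later versions added Unicode-aware case mapping.)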

The reason for this is simply that case conversions for other letters aren't well defined. For example, in Turkish 'I'.downcase # => 'ı' and 'i'.upcase # => 'İ', but in French 'I'.downcase # => 'i' and 'i'.upcase # => 'I'. Ruby would have to know not only the character encoding, but also the language to do that correctly.
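The language dependence can be sketched in a few lines. This assumes Ruby 2.4 or later, which is newer than the behaviour described in this answer and lets String#upcase and String#downcase take a :turkic option; the variable names are only illustrative.

```ruby
# Assumes Ruby 2.4+, where case-mapping methods accept options such as :turkic.

# Default (language-neutral) mapping: what a French or English text expects.
default_lower = 'I'.downcase            # plain ASCII "i"
default_upper = 'i'.upcase              # plain ASCII "I"

# Turkic mapping: the same input letters map to different characters.
turkic_lower = 'I'.downcase(:turkic)    # dotless "ı" (U+0131)
turkic_upper = 'i'.upcase(:turkic)      # dotted "İ" (U+0130)

puts [default_lower, default_upper, turkic_lower, turkic_upper].inspect
```

The caller has to pass the language explicitly; nothing in the string itself tells Ruby which mapping is the right one.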

Even worse, in German

'MASSE'.downcase

is either

'maße'   # "measurements"
'masse'  # "mass"
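To see why the reverse direction is ambiguous, here is a small sketch, again assuming Ruby 2.4 or later (where 'ß' upcases to 'SS'). Both German words collapse to the same uppercase string, so the information needed to lowercase it again is lost.

```ruby
# Assumes Ruby 2.4+ with full Unicode case mapping ("ß".upcase gives "SS").
words = ['maße', 'masse']

# Both words produce the same uppercase form.
upcased = words.map(&:upcase)

# Lowercasing cannot tell them apart; it always yields the "ss" spelling.
round_trip = upcased.map(&:downcase)

puts upcased.inspect
puts round_trip.inspect
```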

In other words: you need to actually understand the text, i.e. you need a full-blown AI, to do case conversions correctly.

I once accidentally constructed a sentence whose correct case conversion was undecidable even for a human reader.

In short: it's simply impossible to do correctly, which is why Ruby doesn't do it at all. There are third-party libraries, however, like the Unicode library and ActiveSupport, which do support a somewhat larger subset of characters.
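For the Cyrillic case in the question, a minimal sketch, assuming Ruby 2.4 or later, where the built-in methods gained Unicode case mapping. On older versions the libraries mentioned above would be needed instead. The :ascii option reproduces the ASCII-only behaviour this answer describes.

```ruby
# Assumes Ruby 2.4+; older versions leave non-ASCII letters untouched.
greeting = 'привет abc'

# Unicode-aware mapping: Cyrillic and Latin letters are both converted.
shouted = greeting.upcase

# ASCII-only mapping: only a-z is converted, Cyrillic stays as it was.
ascii_only = greeting.upcase(:ascii)

puts shouted
puts ascii_only
```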

Jörg W Mittag answered Nov 24 '25 21:11


