
UnicodeDecodeError when using a Python string handling function

I'm doing this:

word.rstrip(s)

Where word and s are strings containing unicode characters.

I'm getting this:

UnicodeDecodeError: 'ascii' codec can't decode byte 0xe0 in position 0: ordinal not in range(128)

There's a bug report where this error happens on some Windows Django systems. However, my situation seems unrelated to that case.

What could be the problem?


EDIT: The code is like this:

def Strip(word):
    for s in suffixes:
        return word.rstrip(s)

Velvet Ghost asked Jun 13 '26 14:06


1 Answer

The problem is that s is a byte string while word is a unicode string, so Python tries to convert s to unicode before it can do the rstrip. That implicit conversion uses the ASCII codec, and s clearly isn't ASCII: it contains a byte outside the ASCII range (the 0xe0 in your traceback), so the decode fails.
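To make the failure concrete, here is a minimal sketch that performs the same ASCII decode explicitly. The name raw_suffix is hypothetical, and I'm assuming the suffix was stored as UTF-8, which is consistent with the 0xe0 byte in the traceback:

```python
# -*- coding: utf-8 -*-
# Hypothetical reproduction: these bytes are the UTF-8 encoding of u'ি' (U+09BF).
raw_suffix = b'\xe0\xa6\xbf'

try:
    # This is the conversion Python 2 attempts behind the scenes when a
    # byte string is passed to a method of a unicode string.
    raw_suffix.decode('ascii')
    message = None
except UnicodeDecodeError as exc:
    message = str(exc)

# Decoding with the codec the bytes were actually written in succeeds.
decoded = raw_suffix.decode('utf-8')

print(message)
```

The point is only that the ASCII codec has no meaning for bytes above 0x7f, so the decode has to be given the real encoding, or avoided altogether by starting from a unicode string.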

Since you initialise the suffixes as literals, the fix is easy: put a u in front of each literal to make it a unicode string:

suffixes = [u'ি']

This will work. As you add more suffixes, each one needs its own u prefix.
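As a rough sketch of how that might look with more than one suffix: the second suffix, the sample word, and the to_unicode helper below are all hypothetical, and the helper is only one possible way to handle suffixes that arrive as byte strings (read from a file, say) rather than as literals. It assumes those bytes are UTF-8.

```python
# -*- coding: utf-8 -*-
# Every literal carries its own u prefix. The second suffix is a
# made-up example, not taken from the question.
suffixes = [u'ি', u'া']


def to_unicode(value, encoding='utf-8'):
    """Return value as a unicode string, decoding it if it is a byte string.

    Hypothetical helper; the UTF-8 default is an assumption.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


word = u'কবি'  # hypothetical sample word ending in the first suffix

# Both operands are unicode now, so no implicit ASCII decode takes place.
stripped = word.rstrip(to_unicode(suffixes[0]))

print(stripped)
```

With unicode on both sides, rstrip never has to guess an encoding, which is what removes the UnicodeDecodeError.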

lvc answered Jun 17 '26 15:06


