
UnicodeDecodeError: 'ascii' codec can't decode '\xc3\xa8' together with '\xe8'

I am having this strange problem below:

>>> a=u'Pal-Andr\xe8'
>>> b='Pal-Andr\xc3\xa8'
>>> print "%s %s" % (a,b) # boom
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 8: ordinal not in range(128)
>>> print "%s" % a
Pal-Andrè
>>> print "%s" % b
Pal-Andrè

I can print a and b separately, but not both together.

What's the problem? How can I print them both?

chfw asked Oct 15 '25 19:10


1 Answer

The actual problem is

b = 'Pal-Andr\xc3\xa8'

Here b is a byte string (str) literal, not a unicode literal. So, when you format them separately, a stays a unicode string and b stays a plain byte string.

>>> "%s" % a
u'Pal-Andr\xe8'
>>> "%s" % b
'Pal-Andr\xc3\xa8'

Note that the u prefix is missing from the second result. You can confirm this with type:

>>> type("%s" % b)
<type 'str'>
>>> type("%s" % a)
<type 'unicode'>

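But when you format them together, the result has to be a single unicode string, so Python 2 implicitly decodes b with the default codec, which is ASCII. The byte \xc3 is outside the ASCII range (0-127), and that is why the code fails with UnicodeDecodeError.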
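The implicit conversion can be reproduced by hand. The sketch below assumes the bytes in b are UTF-8 (the question does not say so, but \xc3\xa8 is the UTF-8 encoding of è) and that the default encoding has not been changed from ASCII. It uses explicit b'' and u'' prefixes so that it runs on both Python 2 and Python 3; the names ascii_error and decoded are only illustrative.

```python
a = u'Pal-Andr\xe8'        # text: 'Pal-Andr' followed by U+00E8
b = b'Pal-Andr\xc3\xa8'    # bytes: the same name encoded as UTF-8 (assumed)

# When a str and a unicode object meet in one format operation,
# Python 2 in effect decodes the str with the ASCII codec.
# Doing that by hand raises the same error as in the question.
try:
    b.decode('ascii')
    ascii_error = None
except UnicodeDecodeError as exc:
    ascii_error = exc

# Decoding with the encoding the bytes were written in succeeds.
decoded = b.decode('utf-8')
```

The error object reports byte 0xc3 at position 8, matching the traceback in the question, while the UTF-8 decode yields a value equal to a.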

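To fix it, make b a unicode object as well. Adding a u prefix to the literal is not enough here: u'Pal-Andr\xc3\xa8' avoids the exception, but it holds the two characters U+00C3 and U+00A8 ('Ã¨') instead of 'è'. Since \xc3\xa8 is the UTF-8 encoding of è, decode the bytes as UTF-8, like this

>>> a=u'Pal-Andr\xe8'
>>> b='Pal-Andr\xc3\xa8'.decode('utf-8')
>>> "%s" % a
u'Pal-Andr\xe8'
>>> "%s" % b
u'Pal-Andr\xe8'
>>> "%s %s" % (a, b)
u'Pal-Andr\xe8 Pal-Andr\xe8'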
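When the byte string arrives at runtime (from a file, a socket, or a database) rather than as a literal, one option is to normalize every value to text before formatting. The following is a minimal sketch, not part of the original answer: to_text is a hypothetical helper, and its utf-8 default is an assumption about how the bytes were encoded. It runs on both Python 2 and Python 3.

```python
def to_text(value, encoding='utf-8'):
    """Return value as text, decoding it first if it is a byte string.

    Hypothetical helper; the utf-8 default is an assumption.
    """
    if isinstance(value, bytes):  # bytes is an alias of str on Python 2
        return value.decode(encoding)
    return value

a = u'Pal-Andr\xe8'
b = b'Pal-Andr\xc3\xa8'

# Normalize both operands to text and use a text format string,
# so no implicit ASCII decode is ever attempted.
line = u"%s %s" % (to_text(a), to_text(b))
```

Decoding once at the boundary where the bytes enter the program keeps the rest of the code working with text only.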
thefourtheye answered Oct 17 '25 09:10