UnicodeEncodeError: 'ascii' codec can't encode character u'\u2019'

Question

I'm trying to read an HTML file, but when I extract the titles and URLs to compare them against my keyword list alist, I get this error: UnicodeEncodeError: 'ascii' codec can't encode character u'\u2019'. The error is shown at http://tinypic.com/r/307w8bl/8

Code

for q in soup.find_all('a'):
    title = (q.get('title'))
    url = ((q.get('href')))
    length = len(alist)
    i = 0
    while length > 0:
        if alist[i] in str(title): #checks for keywords from html form from the titles and urls
            r.write(title)
            r.write("\n")
            r.write(url)
            r.write("\n")
        i = i + 1
        length = length -1
doc.close()
r.close()

Some background: alist is a list of keywords that I compare against each title to pick out the links I want. The strange thing is that if alist contains two or more words, the code runs perfectly, but if it contains only one word, the error above appears. Thanks in advance.

xecgr · Accepted Answer

If your list must be a list of byte strings (str), try encoding the title variable before comparing:

>>> alist = ['á']  # byte string (str), not unicode
>>> title = u'á'  # unicode string
>>> alist[0] in title
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 0: ordinal not in range(128)
>>> title and alist[0] in title.encode('utf-8')
True
>>>
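
Another option, sketched here under assumptions rather than taken from the question, is to go the other way and keep everything as unicode text. In Python 2 the error in the question most likely comes from str(title) (or from r.write(title) on a plain file) when the title contains u'\u2019', so this sketch avoids str() entirely, decodes any byte-string keywords as UTF-8 (an assumption about how alist is encoded), and writes through io.open with an explicit encoding. The names to_text, matching_links and write_matches are hypothetical helpers, not part of the original code.

```python
# -*- coding: utf-8 -*-
from __future__ import unicode_literals

import io


def to_text(value, encoding="utf-8"):
    """Return value as unicode text; None becomes an empty string."""
    if value is None:
        return ""
    if isinstance(value, bytes):
        # Assumption: byte strings (e.g. keywords from the form) are UTF-8.
        return value.decode(encoding)
    return value


def matching_links(links, keywords):
    """Yield (title, url) pairs whose title contains any keyword.

    links is an iterable of (title, url) pairs, such as
    (q.get('title'), q.get('href')) for each q in soup.find_all('a').
    """
    keywords = [to_text(k) for k in keywords]
    for title, url in links:
        title, url = to_text(title), to_text(url)
        if any(k in title for k in keywords):
            yield title, url


def write_matches(path, links, keywords):
    """Write matching titles and URLs to path, one per line, as UTF-8."""
    with io.open(path, "w", encoding="utf-8") as out:
        for title, url in matching_links(links, keywords):
            out.write(title + "\n" + url + "\n")
```

With the loop from the question, this could be called as write_matches('out.txt', ((q.get('title'), q.get('href')) for q in soup.find_all('a')), alist), where 'out.txt' stands in for whatever file r was opened on. Unlike the original loop, this writes each matching link once even if several keywords match, and a missing title or href is treated as an empty string instead of being passed to write().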

Tags:

python

unicode

Jmo
