Python using json to read a string with emoticons

I have a giant .json file.

I'm reading it with:

import json

# Open the file in a with-block so it is closed even if printing fails
with open('file.json') as json_data:
    data = json.load(json_data)

for item in data['payload']['actions']:
    print item['author']
    print item['action_id']
    print item['body']

Eventually one of the item['body'] values contains this string (the escape sequences are actually Facebook emoticons):

words words stuff stuff\ud83c\udf89\ud83c\udf8a\ud83c\udf87\ud83c\udf86\ud83c\udf08\ud83d\udca5\u2728\ud83d\udcab\ud83d\udc45\ud83d\udeb9\ud83d\udeba\ud83d\udc83\ud83d\ude4c\ud83c\udfc3\ud83d\udc6c

which raises this error when printed:

Traceback (most recent call last):
  File "curse.py", line 15, in <module>
    print item['body']
  File "C:\python27\lib\encodings\cp437.py", line 12, in encode
    return codecs.charmap_encode(input,errors,encoding_map)
UnicodeEncodeError: 'charmap' codec can't encode characters in position 35-63: character maps to <undefined>

Is there a way to make it ignore these?
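
The failure can be reproduced without the JSON file. The sketch below assumes Python 3 syntax (the question uses Python 2.7) and a shortened sample string; the helper name can_encode is hypothetical. It shows that the \ud83c\udf89 escape pair decodes to a single character outside the Basic Multilingual Plane, which the cp437 console code page named in the traceback cannot represent.

```python
import json

# A shortened stand-in for item['body']; the sample text is an assumption.
raw = '{"body": "words stuff\\ud83c\\udf89\\u2728"}'
body = json.loads(raw)["body"]

# The surrogate pair \ud83c\udf89 is decoded to one code point, U+1F389.
print(hex(ord(body[-2])))


def can_encode(text, encoding):
    """Return True if every character of text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# Plain ASCII text fits in cp437; the emoticons do not.
print(can_encode("words stuff", "cp437"))
print(can_encode(body, "cp437"))
```

So json.load itself succeeds; the error comes from print encoding the text for the console.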

MikeVaughan asked Feb 18 '26 20:02


1 Answer

You can fall back to string.printable and print only the printable ASCII characters when the normal print fails:

import string

try:
    print item['body']
except UnicodeEncodeError:
    print(''.join(c for c in item['body'] if c in string.printable))
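
Note that filtering by string.printable also discards accented letters and every other non-ASCII character. An alternative, not part of the original answer, is to let the codec drop or replace only the characters the console encoding lacks. This is a minimal sketch assuming Python 3 syntax; the helper names printable_only and console_safe are hypothetical, and the cp437 default simply mirrors the code page in the traceback.

```python
import string


def printable_only(text):
    """The answer's filter: keep only characters in string.printable (ASCII)."""
    return ''.join(c for c in text if c in string.printable)


def console_safe(text, encoding="cp437", errors="ignore"):
    """Drop characters the encoding lacks; errors='replace' substitutes '?'."""
    return text.encode(encoding, errors=errors).decode(encoding)


# Sample text is an assumption: one accented letter plus one emoticon.
sample = "caf\u00e9 party \U0001F389"

# ascii() keeps the demo output safe on any console encoding.
print(ascii(printable_only(sample)))
print(ascii(console_safe(sample)))
print(ascii(console_safe(sample, errors="replace")))
```

The codec-based version keeps every character the console can actually show, so it loses less text than the ASCII-only filter.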
Alex answered Feb 21 '26 10:02