How to load a pickle file containing a dictionary with unicode characters?

Question

I have a dictionary:

mydict={'öö':1,'ää':2}

I have written it to a pickle file:

a=codecs.open(r'mydict.pkl', 'wb', 'utf-8')
pickle.dump(mydict, a)

If I try to load it:

m=codecs.open(r'mydict.pkl', 'rb', 'utf-8')
mydict = pickle.load(m)

I get an error:

KeyError: u"S'\xe4\xe4'
p1
I2
sS'\xf6\xf6'
p2
I1
s."

Any ideas how to solve this? Help is greatly appreciated.

Niklas B. · Accepted Answer

pickle is a binary format, so running the data through a codec translation before writing will break it. Just write it to a regular file opened in binary mode and load it back the same way:

>>> import pickle
>>> mydict={'öö':1,'ää':2}
>>> mydict
{'\xc3\xb6\xc3\xb6': 1, '\xc3\xa4\xc3\xa4': 2}
>>> pickle.dump(mydict, open('/tmp/test.pkl', 'wb'))
>>> pickle.load(open('/tmp/test.pkl', 'rb'))
{'\xc3\xb6\xc3\xb6': 1, '\xc3\xa4\xc3\xa4': 2}

But most probably you want to use Unicode in the first place:

>>> mydict={u'öö':1,u'ää':2}
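
The round trip with Unicode keys can be sketched as follows. This is only a minimal illustration and assumes Python 3, where plain string literals are already Unicode, so the u prefix is optional. The temporary directory and the file name are chosen here for the example and are not part of the original answer.

```python
import os
import pickle
import tempfile

# In Python 3 these keys are Unicode strings without needing a u'' prefix.
mydict = {'öö': 1, 'ää': 2}

# Hypothetical location for the example; any writable path would do.
path = os.path.join(tempfile.mkdtemp(), 'mydict.pkl')

# Plain open() in binary mode: no codec sits between pickle and the file.
with open(path, 'wb') as f:
    pickle.dump(mydict, f)

with open(path, 'rb') as f:
    loaded = pickle.load(f)

print(loaded == mydict)  # True
```

The with blocks also close the file handles, which the one-line REPL calls above leave to the interpreter.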

Geoff Reedy · Answer

I believe the problem is the use of codecs.open. Pickles are binary data, not text, and codecs is for transparent conversion between a text encoding and unicode. You should just use the built-in open instead.
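
One way to see why a text codec does not fit is to try decoding pickled bytes as UTF-8. The sketch below assumes Python 3 and explicitly picks pickle protocol 2 (a choice made for this illustration), whose output starts with the byte 0x80, which is not a valid first byte in UTF-8.

```python
import pickle

mydict = {'öö': 1, 'ää': 2}

# Protocol 2 is chosen explicitly for the example; its stream begins
# with the PROTO opcode 0x80 followed by the protocol number.
data = pickle.dumps(mydict, protocol=2)

try:
    data.decode('utf-8')
    decodes_as_utf8 = True
except UnicodeDecodeError:
    # 0x80 cannot start a UTF-8 sequence, so decoding fails immediately.
    decodes_as_utf8 = False

print(repr(data[:2]))    # b'\x80\x02'
print(decodes_as_utf8)   # False
```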

JSBach · Answer

This is an old question, but I had the same problem and didn't think extra disk I/O was a good solution. I suggest using base64 encoding and decoding instead:

import base64
import pickle

serialized_str = base64.b64encode(pickle.dumps(mydict))
my_obj_back = pickle.loads(base64.b64decode(serialized_str))

cPickle can be used the same way for faster results when processing batches.
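
As a rough sketch of where the base64 step helps, the example below carries a pickle through a text-only container. It assumes Python 3, and the JSON wrapper with its 'payload' key is a hypothetical stand-in for whatever text channel is actually in use.

```python
import base64
import json
import pickle

mydict = {'öö': 1, 'ää': 2}

# b64encode returns bytes made only of ASCII characters, so decoding
# them as ASCII gives an ordinary text string.
text = base64.b64encode(pickle.dumps(mydict)).decode('ascii')

# Hypothetical text-only transport: a JSON document with one field.
document = json.dumps({'payload': text})

# Reverse the steps: JSON -> base64 text -> pickle bytes -> object.
payload = json.loads(document)['payload']
restored = pickle.loads(base64.b64decode(payload))

print(restored == mydict)  # True
```

The encoded string is about a third larger than the raw pickle, which is the price of making it safe for text channels.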

Tags: python, dictionary, unicode, pickle