I have a dictionary:
mydict={'öö':1,'ää':2}
I have written it to a pickle file:
a=codecs.open(r'mydict.pkl', 'wb', 'utf-8')
pickle.dump(mydict, a)
If I try to load it:
m=codecs.open(r'mydict.pkl', 'rb', 'utf-8')
mydict = pickle.load(m)
I get an error:
KeyError: u"S'\\xe4\\xe4'\np1\nI2\nsS'\\xf6\\xf6'\np2\nI1\ns."
Any ideas how to solve this? Help is greatly appreciated.
pickle is a binary format, so applying codec translations before writing will break it. Try writing to a plain file and loading it back:
>>> import pickle
>>> mydict={'öö':1,'ää':2}
>>> mydict
{'\xc3\xb6\xc3\xb6': 1, '\xc3\xa4\xc3\xa4': 2}
>>> pickle.dump(mydict, open('/tmp/test.pkl', 'wb'))
>>> pickle.load(open('/tmp/test.pkl', 'rb'))
{'\xc3\xb6\xc3\xb6': 1, '\xc3\xa4\xc3\xa4': 2}
But most probably you want to use Unicode in the first place:
>>> mydict={u'öö':1,u'ää':2}
I believe the problem is the use of codecs.open. A pickle is binary data, not text, and codecs.open is meant for transparently converting text in some encoding to and from unicode. You should just use the built-in open instead.
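To make that concrete, here is a minimal sketch of the round trip with the built-in open in binary mode. The temporary directory and the file name are assumptions made only for illustration, and the keys are written as escapes (u'\xf6\xf6' is u'öö', u'\xe4\xe4' is u'ää') so the snippet does not depend on the source file's encoding.

```python
import os
import pickle
import tempfile

# Unicode keys, written as escapes: u'\xf6\xf6' is 'öö', u'\xe4\xe4' is 'ää'
mydict = {u'\xf6\xf6': 1, u'\xe4\xe4': 2}

# Hypothetical location, chosen only so the example cleans up easily
path = os.path.join(tempfile.mkdtemp(), 'mydict.pkl')

# Write the pickle as raw bytes: plain open() in binary mode, no codec layer
with open(path, 'wb') as f:
    pickle.dump(mydict, f)

# Read it back the same way
with open(path, 'rb') as f:
    loaded = pickle.load(f)

print(loaded == mydict)
```

The with blocks also close the file handles, which the one-liners in the interpreter session leave to the garbage collector.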
Old issue, but I had the same problem and didn't think extra disk I/O was a good solution. I suggest base64 encoding and decoding the pickled data instead:
import base64
import pickle

# Pickle to a byte string and base64-encode it, then reverse both steps
serialized_str = base64.b64encode(pickle.dumps(mydict))
my_obj_back = pickle.loads(base64.b64decode(serialized_str))
cPickle can be used the same way if you need faster results when processing batches.
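If the pickled data really does have to pass through a text-mode file like the one in the question, the base64 step is what makes that safe, because the encoded payload is plain ASCII. The following is only a sketch under that assumption: the file name and temporary directory are hypothetical, and io.open stands in for codecs.open as the text-mode opener.

```python
import base64
import io
import os
import pickle
import tempfile

# Unicode keys, written as escapes: u'\xf6\xf6' is 'öö', u'\xe4\xe4' is 'ää'
mydict = {u'\xf6\xf6': 1, u'\xe4\xe4': 2}

# Pickle to bytes, then base64-encode so the payload is ASCII-only text
encoded = base64.b64encode(pickle.dumps(mydict)).decode('ascii')

# Hypothetical location for the text file
path = os.path.join(tempfile.mkdtemp(), 'mydict.b64')

# A text-mode, UTF-8 file can now carry the data without corrupting it
with io.open(path, 'w', encoding='utf-8') as f:
    f.write(encoded)

with io.open(path, 'r', encoding='utf-8') as f:
    restored = pickle.loads(base64.b64decode(f.read()))

print(restored == mydict)
```

The trade-off is size: base64 output is roughly a third larger than the raw pickle.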