Loading numpy structured array saved in python3 in python2

It does not appear to be possible to load numpy structured arrays saved in python3 within python2 because the field names are unicode strings.

$ python3
Python 3.4.0 (default, Apr 11 2014, 13:05:11) 
>>> import numpy as np
>>> a = np.zeros(4, dtype=[('x',int)])
>>> np.save('a.npy', {'a': a})
>>> 
$ python2
Python 2.7.6 (default, Mar 22 2014, 22:59:56) 
>>> import numpy as np
>>> np.load('a.npy')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/numpy/lib/npyio.py", line 393, in load
    return format.read_array(fid)
  File "/usr/local/lib/python2.7/dist-packages/numpy/lib/format.py", line 602, in read_array
    array = pickle.load(fp)
ValueError: non-string names in Numpy dtype unpickling

This has been a numpy bug for quite some time: https://github.com/numpy/numpy/issues/2407

Does anyone have a workaround for loading numpy structured arrays saved in python3 from python2 (without having to load and re-save in python3)?

asked by jmlarson, Mar 05 '26 03:03



1 Answer

I don't think this is a unicode field name issue.

In python3 I can save an object array:

In [133]: b=np.array([[1],[1,2],[1,2,3]])
In [134]: np.save('a.npy',b)
In [135]: np.load('a.npy')
Out[135]: array([[1], [1, 2], [1, 2, 3]], dtype=object)

In python2, loading the same file fails:

In [260]: np.load('a.npy')
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-260-76b2da2985df> in <module>()
----> 1 np.load('a.npy')

/usr/local/lib/python2.7/site-packages/numpy/lib/npyio.pyc in load(file, mmap_mode)
    386                 return format.open_memmap(file, mode=mmap_mode)
    387             else:
--> 388                 return format.read_array(fid)
    389         else:
    390             # Try a pickle

/usr/local/lib/python2.7/site-packages/numpy/lib/format.pyc in read_array(fp)
    451     if dtype.hasobject:
    452         # The array contained Python objects. We need to unpickle the data.
--> 453         array = pickle.load(fp)
    454     else:
    455         if isfileobj(fp):

TypeError: must be char, not unicode

The error isn't in quite the same place, but it still involves pickle.load. I get the same error if I save {'a':a}.

With the dictionary wrapper, loading in python3 gives a 0-d object array:

array({'a': array([(0,), (0,), (0,), (0,)], 
      dtype=[('x', '<i4')])}, dtype=object)
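
To make the difference concrete, here is a minimal sketch of the check that read_array performs (the `dtype.hasobject` test visible in the traceback above). It assumes that np.save converts its argument to an array before writing, which is why it uses np.asanyarray; the variable names are only illustrative.

```python
import numpy as np

a = np.zeros(4, dtype=[('x', int)])

# The structured array itself holds plain integers, not Python objects.
plain_needs_pickle = a.dtype.hasobject

# Wrapping it in a dict and converting to an array gives a 0-d object array,
# matching the array({...}, dtype=object) output shown above.
wrapped = np.asanyarray({'a': a})
wrapped_needs_pickle = wrapped.dtype.hasobject

print(plain_needs_pickle, wrapped_needs_pickle, wrapped.shape)
```

Only the wrapped version reports that it contains Python objects, so only that one goes through pickle.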

As moamingsun points out, if you save the array a itself, without the dictionary wrapper, the python2 load works fine.
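
A rough sketch of saving without the wrapper, meant to be run on the python3 side. The allow_pickle keyword is an assumption about your numpy version (it exists in 1.10 and later); with allow_pickle=False, np.save raises an error rather than silently pickling. The np.savez variant is my own suggestion for keeping the name-to-array mapping; I have not tested loading it in python2, so treat that part as an assumption. The temporary directory and file names are illustrative.

```python
import os
import tempfile

import numpy as np

a = np.zeros(4, dtype=[('x', int)])
tmpdir = tempfile.mkdtemp()

# Option 1: save the structured array directly, with no dict wrapper.
# allow_pickle=False refuses to fall back to pickle for object data.
single_path = os.path.join(tmpdir, 'a.npy')
np.save(single_path, a, allow_pickle=False)
loaded_single = np.load(single_path, allow_pickle=False)

# Option 2: keep a name -> array mapping with np.savez, which stores each
# array as its own .npy entry inside a zip archive.
named_path = os.path.join(tmpdir, 'arrays.npz')
np.savez(named_path, a=a)
with np.load(named_path, allow_pickle=False) as archive:
    loaded_named = archive['a']

print(loaded_single.dtype.names, loaded_named.shape)
```

Both loads succeed with pickling disabled, which shows that neither file depends on pickle.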

The problem isn't with field names, but with Python 3 vs. Python 2 pickling. np.save hands the data to pickle whenever it has to save Python objects, and the dictionary wrapper turns the data into an object array. I'm sure py2 vs. py3 pickling compatibility has been discussed in depth elsewhere.

answered by hpaulj, Mar 08 '26 04:03
