I'm trying to go from an ndarray of integer 'flags':
array([[1, 3, 2],
       [2, 0, 3],
       [3, 2, 0],
       [2, 0, 1]])
to an ndarray of strings:
array([['Banana', 'Celery', 'Carrot'],
       ['Carrot', 'Apple', 'Celery'],
       ['Celery', 'Carrot', 'Apple'],
       ['Carrot', 'Apple', 'Banana']],
      dtype='|S6')
Using a list of strings as the mapping of 'flags' to 'meanings':
meanings = ['Apple', 'Banana', 'Carrot', 'Celery']
I've come up with the following:
>>> import numpy as np
>>> meanings = ['Apple', 'Banana', 'Carrot', 'Celery']
>>> flags = np.array([[1,3,2],[2,0,3],[3,2,0],[2,0,1]])
>>> flags
array([[1, 3, 2],
       [2, 0, 3],
       [3, 2, 0],
       [2, 0, 1]])
>>> mapped = np.array([meanings[f] for f in flags.flatten()]).reshape(flags.shape)
>>> mapped
array([['Banana', 'Celery', 'Carrot'],
       ['Carrot', 'Apple', 'Celery'],
       ['Celery', 'Carrot', 'Apple'],
       ['Carrot', 'Apple', 'Banana']],
      dtype='|S6')
This works, but I'm concerned about the efficiency (list comp, flatten, reshape) of the pertinent line when dealing with large ndarrays:
np.array([meanings[f] for f in flags.flatten()]).reshape(flags.shape)
Is there a better/more efficient way of performing a mapping like this?
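Fancy indexing is the numpythonic way of doing it. It works on an ndarray, not on a plain list, so convert meanings first:
meanings = np.array(meanings)
mapped = meanings[flags]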
or the often faster equivalent:
mapped = np.take(meanings, flags)
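As a minimal sketch of the two approaches above, using the data from the question: the only assumption is that meanings is held as an ndarray, which fancy indexing needs (np.take also accepts the plain list). The names mapped_index and mapped_take are just illustrative.

```python
import numpy as np

# Lookup table as an ndarray so it can be indexed by another array.
meanings = np.array(['Apple', 'Banana', 'Carrot', 'Celery'])
flags = np.array([[1, 3, 2], [2, 0, 3], [3, 2, 0], [2, 0, 1]])

# Fancy indexing: each integer in flags selects one entry of meanings.
mapped_index = meanings[flags]

# np.take does the same lookup; the result has the shape of flags.
mapped_take = np.take(meanings, flags)

print(mapped_index)
print(np.array_equal(mapped_index, mapped_take))
```

Both results keep the shape of flags, so no flatten or reshape step is needed.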
I think np.vectorize is the way to go; it's also really clear and easy to follow. I haven't tested the following, but it should work:
vfunc = np.vectorize(lambda x: meanings[x])
mapped = vfunc(flags)
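Since the question is about efficiency, here is a rough sketch for checking the options on your own data. The NumPy documentation describes np.vectorize as a convenience that is essentially a for loop, so it may not beat the list comprehension by much. The helper names (with_list_comp, with_vectorize, with_take) and the tiled test array are made up for this illustration, and timings will vary by machine and NumPy version.

```python
import timeit
import numpy as np

meanings = ['Apple', 'Banana', 'Carrot', 'Celery']
flags = np.array([[1, 3, 2], [2, 0, 3], [3, 2, 0], [2, 0, 1]])

# Larger input built by repeating the small example (400 x 300 elements).
big_flags = np.tile(flags, (100, 100))

def with_list_comp(f):
    # Original approach from the question.
    return np.array([meanings[i] for i in f.flatten()]).reshape(f.shape)

def with_vectorize(f):
    # np.vectorize calls the Python lambda once per element.
    return np.vectorize(lambda x: meanings[x])(f)

def with_take(f):
    # Array-level lookup; np.take accepts the plain list.
    return np.take(meanings, f)

for fn in (with_list_comp, with_vectorize, with_take):
    seconds = timeit.timeit(lambda: fn(big_flags), number=3)
    print(fn.__name__, round(seconds, 4))
```

All three functions return the same array for the same input, so the choice comes down to speed and readability.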