I need to apply a function to each column of a NumPy array. I can't apply it element by element; it has to operate on whole columns, because each column taken together represents one piece of information.
import numpy as np
C = np.random.normal(0, 1, (500, 30))
Is the following the most efficient way to do this (using np.sum for illustration)?
C2 = [ np.sum( C[ :, i ] ) for i in range( 0, 30) ]
The real array C is 500x4000, and the function I apply to each column is time-consuming.
You can try np.apply_along_axis:
In [21]: A = np.array([[1,2,3],[4,5,6]])
In [22]: A
Out[22]:
array([[1, 2, 3],
       [4, 5, 6]])
In [23]: np.apply_along_axis(np.sum, 0, A)
Out[23]: array([5, 7, 9])
In [24]: np.apply_along_axis(np.sum, 1, A)
Out[24]: array([ 6, 15])
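The same call works with a function of your own, as long as it accepts a 1-D array. Here is a minimal sketch; column_range is a made-up stand-in for the real per-column function, not something from the question:

```python
import numpy as np

def column_range(col):
    # Hypothetical per-column function: receives one whole column as a 1-D array
    # and returns a single value for it.
    return col.max() - col.min()

A = np.array([[1, 2, 3], [4, 5, 6]])

# axis=0 slices along the rows, so column_range is called once per column.
ranges = np.apply_along_axis(column_range, 0, A)
```

Here ranges holds one value per column of A.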
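Iterating over the rows of the transposed array is faster still. In the timings below it takes roughly 80% of the time of the indexed list comprehension and less than half the time of np.apply_along_axis: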
[ np.sum(row) for row in C.T ]
It is also more Pythonic. For reference, these are the timeit results:
>>> from timeit import timeit
>>> print(timeit('[ np.sum( C[ :, i ] ) for i in range( 0, 30) ]',
...     setup='import numpy as np; C = np.random.normal(0, 1, (500, 30))', number=1000))
0.418906474798
>>> print(timeit('[ np.sum(row) for row in C.T ]',
...     setup='import numpy as np; C = np.random.normal(0, 1, (500, 30))', number=1000))
0.345153254432
>>> print(timeit('np.apply_along_axis(np.sum, 0, C)',
...     setup='import numpy as np; C = np.random.normal(0, 1, (500, 30))', number=1000))
0.732931300891
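One caveat about these timings: np.sum is only the illustration here. If the real function happens to accept an axis argument, as np.sum does, the Python-level loop can probably be dropped altogether. A rough sketch, assuming that is the case (it will not help for an arbitrary per-column function):

```python
import numpy as np

C = np.random.normal(0, 1, (500, 30))

# Loop over columns by iterating the rows of the transpose, as above.
looped = np.array([np.sum(col) for col in C.T])

# Let NumPy reduce along axis 0 in a single call: one result per column.
vectorized = np.sum(C, axis=0)

# Both approaches should agree up to floating-point rounding.
same = np.allclose(looped, vectorized)
```

For a function without an axis argument, the loop over C.T remains the straightforward choice.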