Python: applying function to each column of an array

Tags: python

I need to apply a function to each column of a NumPy array. It cannot be applied element by element: the function must receive a whole column at a time, because each column taken together represents one piece of information.

import numpy as np
C = np.random.normal(0, 1, (500, 30))

Is this the most efficient way to do it? (I am using np.sum purely for illustration.)

C2 = [np.sum(C[:, i]) for i in range(30)]

My real array C is 500x4000 (the 500x30 array above is just a small example), and the function I apply to each column is time-consuming as well.

Zanam asked Sep 06 '25 03:09


2 Answers

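You can try np.apply_along_axis, which passes each 1-D slice along the given axis to the function (axis 0 gives the columns, axis 1 the rows):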

In [21]: A = np.array([[1,2,3],[4,5,6]])

In [22]: A
Out[22]: 
array([[1, 2, 3],
       [4, 5, 6]])

In [23]: np.apply_along_axis(np.sum, 0, A)
Out[23]: array([5, 7, 9])

In [24]: np.apply_along_axis(np.sum, 1, A)
Out[24]: array([ 6, 15])
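
The same call works with a function that is not a built-in reduction. The following is a minimal sketch, assuming a hypothetical per-column function named column_spread (not from the question) that needs the whole column at once:

```python
import numpy as np


def column_spread(col):
    # Hypothetical stand-in for the real per-column function:
    # it receives one full column as a 1-D array and returns a scalar.
    return col.max() - col.min()


rng = np.random.default_rng(0)
C = rng.normal(0, 1, (500, 30))

# axis=0 hands each column (a 1-D array of length 500) to column_spread.
spread = np.apply_along_axis(column_spread, 0, C)
print(spread.shape)  # one result per column
```

Because column_spread returns a scalar, the result has one entry per column. np.apply_along_axis still calls the function once per column in Python, so it is a convenience rather than a vectorized speedup.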

tobias_k answered Sep 07 '25 20:09


Iterating over the transpose instead appears to take roughly 80% of the time of the indexed loop:

[np.sum(row) for row in C.T]

It is also more Pythonic. For reference, here are the timeit results; np.apply_along_axis was the slowest of the three:

>>> from timeit import timeit
>>> print(timeit('[ np.sum( C[ :, i ] )  for i in range( 0, 30) ]',
...     setup='import numpy as np; C = np.random.normal(0, 1, (500, 30))', number=1000))
0.418906474798
>>> print(timeit('[ np.sum(row) for row in C.T ]',
...     setup='import numpy as np; C = np.random.normal(0, 1, (500, 30))', number=1000))
0.345153254432
>>> print(timeit('np.apply_along_axis(np.sum, 0, C)',
...     setup='import numpy as np; C = np.random.normal(0, 1, (500, 30))', number=1000))
0.732931300891
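
To reproduce the comparison as a script, something like the sketch below could be used. It checks that the three approaches give the same sums before timing them; the dictionary names (candidates, timings) are illustrative, and the absolute numbers will differ by machine and NumPy version.

```python
import numpy as np
from timeit import timeit

C = np.random.default_rng(0).normal(0, 1, (500, 30))

# The three approaches discussed above, wrapped as zero-argument callables.
candidates = {
    "indexed loop": lambda: [np.sum(C[:, i]) for i in range(C.shape[1])],
    "transpose loop": lambda: [np.sum(col) for col in C.T],
    "apply_along_axis": lambda: np.apply_along_axis(np.sum, 0, C),
}

# Sanity check: every approach should yield the same column sums
# (up to floating-point rounding).
reference = candidates["indexed loop"]()
for label, fn in candidates.items():
    assert np.allclose(reference, fn()), label

# Time each approach with the same repeat count as the session above.
timings = {label: timeit(fn, number=1000) for label, fn in candidates.items()}
for label, seconds in timings.items():
    print(f"{label:>18}: {seconds:.3f} s")
```

With a genuinely time-consuming per-column function, the cost of the function itself will likely dominate, so the differences between these looping styles may matter less than they do for np.sum.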

Jared Goguen answered Sep 07 '25 21:09