
Sum array by number in numpy

Tags:

python

numpy

Assuming I have a numpy array like [1,2,3,4,5,6] and another array [0,0,1,2,2,1], I want to sum the items in the first array by group (given by the second array) and obtain one result per group, in group-number order (in this case the result would be [3, 9, 9]). How do I do this in numpy?

Scribble Master asked Sep 04 '25 16:09



2 Answers

The numpy function bincount fits this case directly, and I would expect it to be faster than the other methods for most input sizes. It requires the ids to be non-negative integers:

import numpy as np

data = [1, 2, 3, 4, 5, 6]
ids  = [0, 0, 1, 2, 2, 1]

np.bincount(ids, weights=data)  # returns array([3., 9., 9.]) (float64)

The i-th element of the output is the sum of all the data elements corresponding to "id" i.
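If the group labels are not already small non-negative integers, one possible approach is to map them to such ids first with np.unique and return_inverse=True. This is only a sketch: the string labels below are made up for illustration and are not part of the question.

```python
import numpy as np

data = np.array([1, 2, 3, 4, 5, 6])
# Hypothetical non-integer group labels, used only for illustration
labels = np.array(["a", "a", "b", "c", "c", "b"])

# groups holds the sorted unique labels; ids holds, for each element,
# the position of its label within groups (so ids are 0, 1, 2, ...)
groups, ids = np.unique(labels, return_inverse=True)

# Sum the data per id; sums[i] belongs to groups[i]
sums = np.bincount(ids, weights=data)
```

Here sums lines up with groups, so the two arrays can be zipped together to read off each label's total.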

Hope that helps.

Alex answered Sep 07 '25 17:09



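This is a vectorized way of doing this sum, based on the implementation of numpy.unique. It sorts the values by group, takes a running total, and then differences the totals at the end of each group. According to my timings, it is up to 500 times faster than the loop method and up to 100 times faster than the histogram method.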

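import numpy as np

def sum_by_group(values, groups):
    # Accept plain lists as well as arrays
    values = np.asarray(values)
    groups = np.asarray(groups)
    # Sort both arrays so that equal group labels are adjacent
    order = np.argsort(groups)
    groups = groups[order]
    values = values[order]
    # Running total over the sorted values
    values = values.cumsum()
    # Mark the last element of each run of equal labels
    index = np.ones(len(groups), dtype=bool)
    index[:-1] = groups[1:] != groups[:-1]
    values = values[index]
    groups = groups[index]
    # Difference consecutive run totals to get the per-group sums
    values[1:] = values[1:] - values[:-1]
    return values, groups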
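
The same sort-then-segment idea can also be written with np.add.reduceat, which sums each run directly instead of differencing a cumulative sum. This is a variant sketch, not the code from the answer above: the name sum_by_group_reduceat is made up here, and it assumes the input is non-empty.

```python
import numpy as np

def sum_by_group_reduceat(values, groups):
    # Hypothetical helper name; assumes non-empty 1-D inputs
    values = np.asarray(values)
    groups = np.asarray(groups)
    # Sort so that equal group labels are adjacent
    order = np.argsort(groups, kind="stable")
    groups = groups[order]
    values = values[order]
    # Index where each run of equal labels starts
    starts = np.flatnonzero(np.r_[True, groups[1:] != groups[:-1]])
    # Sum values[starts[i]:starts[i+1]] for every run
    return np.add.reduceat(values, starts), groups[starts]

sums, labels = sum_by_group_reduceat([1, 2, 3, 4, 5, 6], [0, 0, 1, 2, 2, 1])
```

For the arrays in the question, sums and labels should pair up as one total per group label, in sorted label order.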
Bi Rico answered Sep 07 '25 16:09
