
Finding closest values in two numpy arrays

The aim here is speed: I am trying to avoid looping through the arrays in question. It can, however, be assumed that both arrays are sorted.

import numpy as np

a = np.arange(10)
b = np.array([2.3, 3.5, 5.8, 13])
c = somefunc(a, b)

Now somefunc should return, for each value in b, the index of the closest value in a, i.e.

In []: c
Out[]: array([2, 3 or 4, 6, 9])  # 3 or 4, depending on Python 2 or 3

Once again, this could be done with a loop, but I am looking for something a lot faster. I got quite close with an absolute-difference approach, something like:

np.argmin(np.abs(a[:, np.newaxis] - b), axis=0)

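But even this is a little slow, because it subtracts every element of b from every element of a, and most of those len(a) * len(b) subtractions are unnecessary when both arrays are sorted.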
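
To make the cost concrete, here is a small sketch of the broadcast approach on the example arrays above. The intermediate array holds one entry per (a, b) pair, so both memory and work grow as len(a) * len(b). The variable name diff is only illustrative.

```python
import numpy as np

a = np.arange(10)
b = np.array([2.3, 3.5, 5.8, 13])

# Broadcasting builds every pairwise difference: shape (len(a), len(b)).
diff = np.abs(a[:, np.newaxis] - b)
print(diff.shape)  # (10, 4)

# argmin down each column gives the index into a with the smallest difference.
# On a tie (3.5 lies midway between 3 and 4) argmin returns the first index.
c = np.argmin(diff, axis=0)
print(c)  # [2 3 6 9]
```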

RexFuzzle asked Sep 06 '25 11:09



1 Answer

Using the suggestion from @Eelco to use searchsorted, I came up with the following, which is quicker on larger datasets than the broadcast np.argmin approach. It returns pairs of indices: the first column indexes into a, the second into b.

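import numpy as np

def finder(a, b):
    # Insertion index into a for each value of b (a must be sorted).
    dup = np.searchsorted(a, b)
    uni = np.unique(dup)
    # Drop values of b that lie beyond the last element of a.
    uni = uni[uni < a.shape[0]]
    ret_b = np.zeros(uni.shape[0])
    for idx, val in enumerate(uni):
        # Among the values of b sharing this insertion index, keep the one nearest a[val].
        tt = dup == val
        bw = np.argmin(np.abs(a[val] - b[tt]))
        ret_b[idx] = np.where(tt)[0][bw]
    # Column 0: index into a; column 1: index into b.
    return np.column_stack((uni, ret_b))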
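
The Python loop can also be avoided altogether. What follows is a minimal sketch rather than the code from this answer: it assumes a is sorted and has at least two elements, and the name closest_indices is made up for illustration. It uses np.searchsorted to locate the right-hand neighbour of each value of b in a, compares it with the left-hand neighbour, and keeps whichever is nearer (ties go to the left).

```python
import numpy as np

def closest_indices(a, b):
    """For each value in b, return the index of the nearest value in sorted a."""
    a = np.asarray(a)
    b = np.asarray(b)
    # Index of the first element of a that is >= each value of b.
    right = np.searchsorted(a, b)
    # Clip so that both neighbours are valid indices, even at the ends of a.
    right = np.clip(right, 1, len(a) - 1)
    left = right - 1
    # Keep the left neighbour when it is at least as close as the right one.
    use_left = (b - a[left]) <= (a[right] - b)
    return np.where(use_left, left, right)

a = np.arange(10)
b = np.array([2.3, 3.5, 5.8, 13])
print(closest_indices(a, b))  # [2 3 6 9]
```

This does one binary search per value of b, so the work should scale roughly as len(b) * log(len(a)) instead of len(a) * len(b).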
RexFuzzle answered Sep 09 '25 02:09
