I have an irregular (non-rectangular) lon/lat grid and a bunch of points in lon/lat coordinates, which should correspond to points on the grid (though they might be slightly off for numerical reasons). Now I need the indices of the corresponding lon/lat points.
I've written a function which does this, but it is REALLY slow.
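import numpy as np

def find_indices(lon,lat,x,y):
    # Stack lon/lat into one array of coordinate pairs, shape lon.shape + (2,).
    lonlat = np.dstack([lon,lat])
    delta = np.abs(lonlat-[x,y])
    # Flat index of the grid point with the smallest Euclidean distance to (x,y).
    ij_1d = np.linalg.norm(delta,axis=2).argmin()
    i,j = np.unravel_index(ij_1d,lon.shape)
    return i,j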
ind = [find_indices(lon,lat,*p) for p in points]
I'm pretty sure there's a better (and faster) solution in numpy/scipy. I've already googled quite a lot, but the answer has so far eluded me.
Any suggestions how to efficiently find the indices of the corresponding (nearest) points?
PS: This question emerged from another one (click).
Based on @Cong Ma's answer, I've found the following solution:
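import numpy as np
import scipy as sp
import scipy.spatial

def find_indices(points,lon,lat,tree=None):
    if tree is None:
        # Flatten the (transposed) grid into an (n, 2) array of lon/lat pairs.
        lon,lat = lon.T,lat.T
        lonlat = np.column_stack((lon.ravel(),lat.ravel()))
        tree = sp.spatial.cKDTree(lonlat)
    # Nearest grid point (flat index) for every query point.
    dist,idx = tree.query(points,k=1)
    # Note: lon.shape must match the layout the tree was built from.
    ind = np.column_stack(np.unravel_index(idx,lon.shape))
    return [(i,j) for i,j in ind]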
To put this solution and the one from Divakar's answer into perspective, here are some timings of the function that calls find_indices, where find_indices is the speed bottleneck (see link above):
spatial_contour_frequency/pil0                :   331.9553
spatial_contour_frequency/pil1                :   104.5771
spatial_contour_frequency/pil2                :     2.3629
spatial_contour_frequency/pil3                :     0.3287
pil0 is my initial approach, pil1 is Divakar's, and pil2/pil3 are the final solution above. In pil2 the tree is created on the fly (i.e. on every iteration of the loop in which find_indices is called); in pil3 it is created only once (see other thread for details). Divakar's refinement of my initial approach gives me a 3x speed-up, but cKDTree takes this to a whole new level with another speed-up of roughly 44x! And moving the creation of the tree out of the function makes things about 7x faster again.
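The difference between pil2 and pil3 comes down to building the tree once and reusing it. A minimal sketch of that pattern follows; the grid values and the helper name nearest_indices are made up for illustration, and the grid is flattened in plain row-major order rather than transposed as in the function above.

```python
import numpy as np
from scipy.spatial import cKDTree

# Hypothetical 3x4 curvilinear grid (values invented for illustration).
jj, ii = np.meshgrid(np.arange(4.0), np.arange(3.0))
lon = 10.0 + jj + 0.1 * ii
lat = 50.0 + ii + 0.05 * jj

# Build the tree once from the flattened grid (row-major order).
tree = cKDTree(np.column_stack((lon.ravel(), lat.ravel())))

def nearest_indices(points, tree, shape):
    """Return an (n, 2) array of (i, j) grid indices nearest to each point."""
    _, flat = tree.query(points, k=1)
    return np.column_stack(np.unravel_index(flat, shape))

# Grid points that are slightly off for numerical reasons.
points = np.array([[lon[1, 2] + 1e-6, lat[1, 2] - 1e-6],
                   [lon[2, 0], lat[2, 0]]])

# The same tree can be reused for every subsequent call.
ind = nearest_indices(points, tree, lon.shape)
```

Passing the shape explicitly keeps the flat-to-2D conversion tied to the layout that was used when the tree was built.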
If the points are sufficiently localized, you can use scipy.spatial's cKDTree implementation directly, as I discussed in another post. That post was about interpolation, but you can ignore that and just use the query part.
tl;dr version:
Read the documentation of scipy.spatial.cKDTree. Create the tree by passing an (n, m)-shaped numpy ndarray object to the initializer, and the tree will be created from the n m-dimensional coordinates.
tree = scipy.spatial.cKDTree(array_of_coordinates)
After that, use tree.query() to retrieve the k nearest neighbors (possibly with approximation and parallelization, see docs), or use tree.query_ball_point() to find all neighbors within a given distance tolerance.
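As a rough illustration of the two query styles, here is a small sketch on made-up coordinates. The distance_upper_bound argument is one way to reject points that have no grid neighbor within a tolerance; the tolerance values are arbitrary.

```python
import numpy as np
from scipy.spatial import cKDTree

# Made-up coordinates: five points in the plane.
coords = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [5.0, 5.0]])
tree = cKDTree(coords)

# Distances to and indices of the two nearest neighbors of one query point.
dist, idx = tree.query([0.1, 0.0], k=2)

# Indices of all points within radius 1.05 of the origin.
near = sorted(tree.query_ball_point([0.0, 0.0], r=1.05))

# With an upper bound, a query without any neighbor in range reports
# an infinite distance and an index equal to tree.n.
d_miss, i_miss = tree.query([3.0, 3.0], k=1, distance_upper_bound=0.5)
```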
If the points are not well localized and spherical curvature or non-trivial topology comes into play, you can try breaking the manifold into multiple parts, each small enough to be considered local.
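A different workaround, not the partitioning approach described above, is to map lon/lat to 3D Cartesian coordinates on the unit sphere and build the tree there, since straight-line (chord) distance grows monotonically with great-circle distance. The sketch below assumes coordinates in degrees on a perfect sphere; the helper name lonlat_to_xyz and the sample values are made up for illustration.

```python
import numpy as np
from scipy.spatial import cKDTree

def lonlat_to_xyz(lon_deg, lat_deg):
    """Map lon/lat in degrees to Cartesian points on the unit sphere."""
    lon = np.radians(lon_deg)
    lat = np.radians(lat_deg)
    return np.column_stack((np.cos(lat) * np.cos(lon),
                            np.cos(lat) * np.sin(lon),
                            np.sin(lat)))

# Made-up grid points; index 1 sits just across the date line from the query.
grid_lon = np.array([170.0, -179.9, 0.0])
grid_lat = np.array([0.0, 0.0, 0.0])
tree = cKDTree(lonlat_to_xyz(grid_lon, grid_lat))

# Query at lon = 179.9: in the lon/lat plane index 0 looks closest,
# but on the sphere index 1 is only 0.2 degrees away.
_, idx = tree.query(lonlat_to_xyz([179.9], [0.0]), k=1)
```

The returned distances are chord lengths on the unit sphere, not degrees, so any tolerance would need to be expressed in those units.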