Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Producing a nearest neighbour distance map for a set of points

I have a list of (x,y) points. I'm trying to produce a plot of the distance to each point as an image, so my naive function looks like:

from scipy.spatial.distance import cdist
from numpy import *

def findDistances(imageSize, points):
    image = zeros(imageSize)
    for x in range(0,imageSize[1]):
        for y in range(0,imageSize[0]):
            image[y,x] = np.min(cdist(array([[x,y]]), points))
    return image

This function is fine, it gives what I want (see picture below). This takes about 100s for a ~1MP image which is fine for something that only ever needs to be done once, but I assume there's room for improvement. I also tried:

image[y,x] = min(linalg.norm(array([[x,y]])- points, axis=1))

Which runs in a comparable time - makes sense, they're probably doing something similar under the hood, though I haven't checked the source to be sure.

I had a look at Scipy's cKDTree, with:

from scipy.spatial import cKDTree

def findDistancesTree(imageSize, points):
    tree = cKDTree(points)
    image = zeros(imageSize)
    for x in range(0,imageSize[1]):
        for y in range(0,imageSize[0]):
            image[y,x] = tree.query([x,y])[0]
    return image

A call to tree.query takes around 50µs (from %timeit) and in reality it takes 70-80s to generate a 1MP distance map. 20s improvement is better than a kick in the groin, but I don't know if I can improve it further.

%timeit np.min(linalg.norm(array([[random.random()*1000,random.random()*1000]]) - points, axis=1))
%timeit np.min(cdist(array([[random.random()*1000,random.random()*1000]]), points))
%timeit tree.query(array([[random.random()*1000,random.random()*1000]]))

10000 loops, best of 3: 82.8 µs per loop
10000 loops, best of 3: 67.9 µs per loop
10000 loops, best of 3: 52.3 µs per loop

In terms of complexity, brute force should be something like O(NM) where N is the number of pixels in the image and M is the number of points to check against. I was expecting more of a speedup as the search time should be more like O(N log(M)) for N pixels with a log(M) look-up time for each one - am I missing something?

Imgur

like image 458
Josh Avatar asked Jan 20 '26 07:01

Josh


1 Answers

This sounded like a problem where even a basic brute force implementation with a GPU would give good improvement. So i gave it a shot. And the improvement was quite good.

I did my testing using pyopencl.

import pyopencl as cl 
import numpy as np 

def findDistances_cl(imageSize, points):
    #Boilerplate opencl code
    ctx = cl.create_some_context()
    queue = cl.CommandQueue(ctx)

    f = open('nn.cl', 'r')
    fstr = "".join(f.readlines())
    program = cl.Program(ctx, fstr).build()

    #Creating buffers for the opencl kernel
    mf = cl.mem_flags
    img = np.empty(imageSize, dtype=np.float32)
    x_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.USE_HOST_PTR, hostbuf=points[:,0].astype(np.float32))
    y_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.USE_HOST_PTR, hostbuf=points[:,1].astype(np.float32))
    n_points = cl.Buffer(ctx, mf.READ_ONLY | mf.USE_HOST_PTR, hostbuf=np.array([len(points)],dtype=np.int))
    img_buf = cl.Buffer(ctx, mf.WRITE_ONLY, img.nbytes)

    #Run the kernel
    exec_evt = program.nn(queue, img.shape, None, img_buf, x_buf, y_buf, n_points)
    exec_evt.wait()
    #read back the result
    cl.enqueue_read_buffer(queue, img_buf, img).wait()

    return img

the opencl kernel (nn.cl)

__kernel void nn(__global float *output, __global constant float *x , __global constant float *y, __global constant int *numPoints)
    {
        int row = get_global_id(0);
        int col = get_global_id(1);

        int numRows = get_global_size(0);
        int numCols = get_global_size(1);

        int gid = col+ row*numCols;

        float minDist = numRows * numCols;

        for(int i = 0; i < *numPoints; i++){
          minDist = min(minDist, sqrt((row - y[i])*(row - y[i]) + (col - x[i])*(col - x[i])));
        }
        output[gid] = minDist;
    }

Timing results.

imageSize = [1000, 1000]
points = np.random.random((1000,2))*imageSize[0]

In [4]: %timeit findDistancesTree(imageSize, points)
1 loops, best of 3: 27.1 s per loop

In [7]: %timeit findDistances_cl(imageSize, points)
10 loops, best of 3: 55.3 ms per loop

About 490x speed improvement. If you need more speed there are more advanced algorithms out there for doing nearest neighbors with GPUs.

like image 166
M4rtini Avatar answered Jan 22 '26 20:01

M4rtini



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!