I was at a interview for a High Frequency Trading firm. They asked me for an algorithm of O(n):
n points in spaceO(1),3.I Googled and did some research. There is a O(n) algorithm (Best circle placement by Chazelle from Princeton University), but it is kind of beyond my level to understand and put it together to explain it in 10 mins. I already know O(n^2) and O(n^3) algorithms.
Please help me find an O(n) algorithm.
I guess the integer coordinate constraint simplifies the problem notably. This looks like O(n) to me:
-Make a dictionary of all integer points in space and set the entries to 0.
-For each datapoint find the integer points that are within radius 3, and add 1 to the corresponding entries of the dictionary. The reason for doing this is that the set of points that can be the centers of a circle in which that particular datapoint is inside is the integer restriction of a circle with the same radius around that datapoint. The search could be done over all points lying on a square of length 6 (thought not all points need to be evaluated explicitly as these inside the inscribed hypercube are inside for sure).
-Return the integer point corresponding to the maximum value of the dictionary, ie the center for which most datapoints are inside the circle.
Edit: I guess some code is better than explanations. This is working python with numpy and matplotlib. Shouldn't be too difficult to read:
# -*- coding: utf-8 -*-
"""
Created on Mon Mar 11 19:22:12 2013
@author: Zah
"""
from __future__ import division
import numpy as np
import numpy.random
import matplotlib.pyplot as plt
from collections import defaultdict
import timeit
def make_points(n):
    """Generates n random points"""
    return np.random.uniform(0,30,(n,2))
def find_centers(point, r):
    """Given 1 point, find all possible integer centers searching in a square 
    around that point. Note that this method can be imporved."""
    posx, posy = point
    centers = ((x,y) 
        for x in xrange(int(np.ceil(posx - r)), int(np.floor(posx + r)) + 1)
        for y in xrange(int(np.ceil(posy - r)), int(np.floor(posy + r)) + 1)        
        if (x-posx)**2 + (y-posy)**2 < r*r)
    return centers
def find_circle(n, r=3.):
    """Find the best center"""
    points = make_points(n)
    d = defaultdict(int)
    for point in points:
        for center in find_centers(point, r):
            d[center] += 1
    return max(d , key = lambda x: d[x]), points
def make_results():
    """Green circle is the best center. Red crosses are posible centers for some
    random point as an example"""
    r = 3
    center, points = find_circle(100)
    xv,yv = points.transpose()
    fig = plt.figure()
    ax = fig.add_subplot(111)
    ax.set_aspect(1)
    ax.scatter(xv,yv)
    ax.add_artist(plt.Circle(center, r, facecolor = 'g', alpha = .5, zorder = 0))
    centersx, centersy  = np.array(list(find_centers(points[0], r))).transpose()
    plt.scatter(centersx, centersy,c = 'r', marker = '+')
    ax.add_artist(plt.Circle(points[0], r, facecolor = 'r', alpha = .25, zorder = 0))
    plt.show()
if __name__ == "__main__":
    make_results()
Results:
 Green circle is the best one, and the red stuff demonstrates how centers are picked for some random point.
Green circle is the best one, and the red stuff demonstrates how centers are picked for some random point.
In [70]: %timeit find_circle(1000)
1 loops, best of 3: 1.76 s per loop
In [71]: %timeit find_circle(2000)
1 loops, best of 3: 3.51 s per loop
In [72]: %timeit find_circle(3000)
1 loops, best of 3: 5.3 s per loop
In [73]: %timeit find_circle(4000)
1 loops, best of 3: 7.03 s per loop
In my really slow machine. Behaviour is clearly linear.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With