I have a list of coordinates that form axis-aligned 2D boxes (axis-oriented/iso-oriented rectangles, rects).
I want to see how many boxes intersect each other without repeating it. For example, I have boxes A, B, and C. If A intersects with B, B intersecting with A will still count as one intersection instead of two separate ones.
The way I'm thinking is to grab one box, compare it to all the rest and then throw it out since the comparison is done and I want no repeats. For example going back to A, B and C. If I'm on A and it intersects B and not C I have made its round and hence will not need to keep it in the loop. Hence once I'm on B it will not recheck A.
I can't think of a faster method. This is akin to finding duplicates using linear search and I think the worst-case complexity will be O(n^2). I don't see how to use binary search as it is possible that there are multiple intersections. I've been also looking at the sorting algorithm but I need one that won't go back and check the older values again.
You can solve this in O(n log n)
. Since you're trying to solve a static intersection problem in two dimensions, a common trick is to transform the problem into a dynamic intersection problem in one dimension (i.e., one space dimension becomes the time dimension).
Suppose you have a closed rectangle with lower left corner (x1, y1)
and upper right corner (x2, y2)
. This rectangle becomes two "interval events":
Interval [y1, y2] inserted at time x1
Interval [y1, y2] removed after time x2
Transform all your rectangles into events in this way, and then sort by time (breaking ties with insertions coming before removals).
Now, you need a data structure that lets you add and remove intervals [A, B]
, and also query the number of intervals in the data structure intersecting [A, B]
. Then you process the "interval events" in the sorted order, but keep a running sum of how many current intervals intersect [A, B]
before inserting each interval [A, B]
.
One data structure to do this in O(log n)
time per operation is with two balanced binary search trees: one holding beginning-points of intervals, and the other holding ending-points of intervals. You'll also need to augment each BST to be an Order Statistic Tree, to quickly query the number of points less than or equal to a certain value that are in the tree.
Then, finding how many intervals currently in your data structure intersect an arbitrary interval [A, B]
is simple; that count is:
#(Intervals intersecting [A, B]) =
#(values in the beginning-points tree that are <= B)
- #(values in the ending-points tree that are < A)
which you can check is correct from the definition of two intervals intersecting: neither interval starts after the other one has ended.
You could also replace the order statistic tree with a data structure for prefix-sums, like a Fenwick tree, which requires much less code for the same complexity.
Sample implementation of kcsquared's algorithm. You didn't specify a language, so I chose Python since I'm familiar with it and it's short readable code. Rectangles have lower left coordinate (x,y) and upper right coordinate (X,Y).
def intersections(rects):
events = []
for A in rects:
events.append((A.x, 'insert', A.y, A.Y))
events.append((A.X, 'remove', A.y, A.Y))
intersections = 0
ys = SortedList()
Ys = SortedList()
for x, op, y, Y in sorted(events):
if op == 'insert':
intersections += ys.bisect_right(Y) - Ys.bisect_left(y)
ys.add(y)
Ys.add(Y)
else:
ys.remove(y)
Ys.remove(Y)
return intersections
Tests on lists of 5,000 random rectangles against an O(n2) reference implementation, showing the numbers of intersections and runtimes:
124257 5465 ms reference
124257 127 ms intersections
121166 5444 ms reference
121166 124 ms intersections
118980 5435 ms reference
118980 124 ms intersections
With 10,000 rectangles:
489617 22342 ms reference
489617 292 ms intersections
489346 22491 ms reference
489346 296 ms intersections
489990 22302 ms reference
489990 290 ms intersections
Full code:
def reference(rects):
intersections = 0
for A, B in combinations(rects, 2):
if A.X >= B.x and A.x <= B.X and A.Y >= B.y and A.y <= B.Y:
intersections += 1
return intersections
def intersections(rects):
events = []
for A in rects:
events.append((A.x, 'insert', A.y, A.Y))
events.append((A.X, 'remove', A.y, A.Y))
intersections = 0
ys = SortedList()
Ys = SortedList()
for x, op, y, Y in sorted(events):
if op == 'insert':
intersections += ys.bisect_right(Y) - Ys.bisect_left(y)
ys.add(y)
Ys.add(Y)
else:
ys.remove(y)
Ys.remove(Y)
return intersections
from random import randint, randrange
from itertools import combinations
from timeit import default_timer as timer
from sortedcontainers import SortedList
from collections import namedtuple
Rect = namedtuple('Rect', 'x X y Y')
for _ in range(3):
rects = [Rect(x, x + randint(1, 100), y, y + randint(1, 100))
for _ in range(5000)
for x, y in [(randrange(1000), randrange(1000))]]
for func in reference, intersections:
t0 = timer()
result = func(rects)
t1 = timer()
print(result, '%4d ms' % ((t1 - t0) * 1e3), func.__name__)
print()
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With