Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to cluster lists with shared elements

This is a problem I can't currently find on Leetcode or StackOverflow. Lets say you have a set of lists of numbers or references as so:

[[1,2,3],[3,4,5],[6,7,8],[8,9],[10],[7]]

What is the fastest algorithm to merge these lists such that the output would be:

[[1,2,3,4,5],[6,7,8,9],[10]]

Many thanks.

like image 980
Erlich Avatar asked Dec 18 '25 15:12

Erlich


1 Answers

Prepare a graph from the lists as follows and find the connected components using depth-first search.

Each list gives rise to undirected edges that connected the first element to the others, e.g.,

[1,2,3] -> [(1,2), (1,3)]
[3,4,5] -> [(3,4), (3,5)]
[6,7,8] -> [(6,7), (6,8)]
[8,9]   -> [(8,9)]
[10]    -> []
[7]     -> []

Then run depth-first search to find connected components. In Python, everything goes something like this.

import collections
def merge(lsts):
    neighbors = collections.defaultdict(set)
    for lst in lsts:
        if not lst:
            continue
        for x in lst:
            neighbors[x].add(lst[0])
            neighbors[lst[0]].add(x)
    visited = set()
    for v in neighbors:
        if v in visited:
            continue
        stack = [v]
        component = []
        while stack:
            w = stack.pop()
            if w in visited:
                continue
            visited.add(w)
            component.append(w)
            stack.extend(neighbors[w])
        yield component
like image 100
David Eisenstat Avatar answered Dec 20 '25 10:12

David Eisenstat



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!