Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Efficient connection grouping algorithm or implementation in python

I'm looking for an efficient connection grouping (I'm not sure this is proper name..) algorithm or implementation by python.

For example, I have this nested list:

connection_data = [
   ...:     ["A", "B", "C"],
   ...:     ["B", "D"],
   ...:     ["A", "C"],
   ...:     ["E", "F"],
   ...:     ["C", "D"],
   ...:     ]

This data means each list in the nested list shows connections. For example, the first connection ["A", "B", "C"] means A and B and C are having connection each other. The nested list has multiple connections information.

I would like to calculate connection groupings from a nested list. For example, when I have the upper connection_data, I would like to get

grouped_connection = [
   ...:     ["A", "B", "C", "D"],
   ...:     ["E", "F"],
   ...:     ]

Because, A, B, C, D have connections in these connection data in the connection_data: ["A", "B", "C"], ["B", "D"], ["A", "C"], ["C", "D"], E and F have connection by ["E", "F"].

To summarize my questions:

  1. What is this type of problem called in general?
  2. I think I can implement a many for-loop based solver. But is there any efficient algorithm or implementation in python for this type of problem?
like image 479
Atsushi Sakai Avatar asked Oct 18 '25 17:10

Atsushi Sakai


1 Answers

Networkx provides an implementation of a union find datastructure [1] [2] that solves this problem efficiently:

from networkx.utils.union_find import UnionFind

groups = [
    ["A", "B", "C"],
    ["B", "D"],
    ["A", "C"],
    ["E", "F"],
    ["C", "D"],
]

ds = UnionFind()
for gp in groups:
  ds.union(*gp)
for s in ds.to_sets():
  print(s)

# {'B', 'C', 'D', 'A'}
# {'E', 'F'}
like image 174
hilberts_drinking_problem Avatar answered Oct 20 '25 08:10

hilberts_drinking_problem