I'm looking for an efficient connection grouping (I'm not sure this is proper name..) algorithm or implementation by python.
For example, I have this nested list:
connection_data = [
...: ["A", "B", "C"],
...: ["B", "D"],
...: ["A", "C"],
...: ["E", "F"],
...: ["C", "D"],
...: ]
This data means each list in the nested list shows connections.
For example, the first connection ["A", "B", "C"]
means A
and B
and C
are having connection each other.
The nested list has multiple connections information.
I would like to calculate connection groupings from a nested list.
For example, when I have the upper connection_data
, I would like to get
grouped_connection = [
...: ["A", "B", "C", "D"],
...: ["E", "F"],
...: ]
Because, A
, B
, C
, D
have connections in these connection data in the connection_data
: ["A", "B", "C"], ["B", "D"], ["A", "C"], ["C", "D"]
, E
and F
have connection by ["E", "F"]
.
To summarize my questions:
Networkx provides an implementation of a union find datastructure [1] [2] that solves this problem efficiently:
from networkx.utils.union_find import UnionFind
groups = [
["A", "B", "C"],
["B", "D"],
["A", "C"],
["E", "F"],
["C", "D"],
]
ds = UnionFind()
for gp in groups:
ds.union(*gp)
for s in ds.to_sets():
print(s)
# {'B', 'C', 'D', 'A'}
# {'E', 'F'}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With