 

Weighted quick-union with path compression algorithm: time complexity analysis

I'm learning the "weighted quick-union with path compression" algorithm for a union/find structure. The Princeton course site explains the algorithm in detail, and here is the implementation in Java:

public class WQUPC {
  private int[] id;
  private int[] sz;
  public WQUPC(int N) {
    id = new int[N];
    sz = new int[N];
    for (int i = 0; i < N; i++) {
      id[i] = i;
      sz[i] = 1;
    }
  }

  int root(int i) {
    while (i != id[i]) {
      id[i] = id[id[i]]; // path compression by halving: point i at its grandparent
      i = id[i];
    }
    return i;
  }

  boolean connected(int p, int q) { return root(p) == root(q); }

  void union(int p, int q) {
    int i = root(p);
    int j = root(q);
    if (i == j) return;    // already in the same set; without this, sz gets inflated
    if (sz[i] < sz[j]) {   // weighting: attach the smaller tree under the larger
      id[i] = j;
      sz[j] += sz[i];
    } else {
      id[j] = i;
      sz[i] += sz[j];
    }
  }
}
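For a quick sanity check, here is a self-contained version of the structure with a small driver (the class name `WQUPCDemo` and the `i == j` guard in `union` are my additions; without the guard, unioning two already-connected elements inflates the size counters):

```java
// Self-contained sketch of weighted quick-union with path halving, plus a driver.
class WQUPC {
    private final int[] id;
    private final int[] sz;

    WQUPC(int n) {
        id = new int[n];
        sz = new int[n];
        for (int i = 0; i < n; i++) { id[i] = i; sz[i] = 1; }
    }

    int root(int i) {
        while (i != id[i]) {
            id[i] = id[id[i]]; // path compression by halving
            i = id[i];
        }
        return i;
    }

    boolean connected(int p, int q) { return root(p) == root(q); }

    void union(int p, int q) {
        int i = root(p), j = root(q);
        if (i == j) return;  // already connected: nothing to do
        if (sz[i] < sz[j]) { id[i] = j; sz[j] += sz[i]; }
        else               { id[j] = i; sz[i] += sz[j]; }
    }
}

public class WQUPCDemo {
    public static void main(String[] args) {
        WQUPC uf = new WQUPC(10);
        uf.union(3, 4);
        uf.union(4, 9);
        uf.union(8, 0);
        System.out.println(uf.connected(3, 9)); // true: 3-4-9 are linked
        System.out.println(uf.connected(0, 9)); // false: 0 is in the {8, 0} set
    }
}
```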

Here is what the website says about its performance:

Theorem: Starting from an empty data structure, any sequence of M union and find operations on N objects takes O(N + M lg* N) time.

• Proof is very difficult.

• But the algorithm is still simple!

But I'm still curious where the iterated logarithm lg* N comes from. How is it derived? Can someone prove it, or just explain it in an intuitive way?

asked Feb 15 '26 by pjpj

2 Answers

To begin with, your question has a slight mistake: with path compression alone, the bound is only O(m log(n)) (without the iterated log). See, for example, exercise 21.4-4 in Introduction to Algorithms. In fact, your block

    if (sz[i] < sz[j]) {
      id[i] = j;
      sz[j] += sz[i];
    } else {
      id[j] = i;
      sz[i] += sz[j];
    }

does union by size (the "weighted" part), which plays the same role as union by rank in the analysis.

With both union by rank (or size) and path compression, though, the bound you quoted can be proved fairly easily (much more easily than the inverse Ackermann one). The proof rests on three points:

  1. On each leaf-to-root path, the ranks of the nodes are strictly increasing. This relies on union by rank, BTW, and, with it, it is very easy to show.

  2. If the root of a tree has rank r, the tree has at least 2^r nodes. This can be shown by induction.

Based on 2., it's possible to show

  3. The number of nodes with rank r is at most n / 2^r.

The rest of the proof follows by considering the worst possible arrangement of the ranks, which still shows that not too many nodes can be expensive. For further details, see the Wikipedia entry on this.
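To get a feel for how slowly lg* grows, here is a small sketch (my own helper, not part of the question's code) that computes the iterated logarithm by repeatedly taking floor(log2) until the value drops to 1:

```java
public class IteratedLog {
    // lg*(n): how many times floor(log2) must be applied before the value is <= 1.
    static int lgStar(long n) {
        int count = 0;
        while (n > 1) {
            n = 63 - Long.numberOfLeadingZeros(n); // floor(log2(n)) for n > 0
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(lgStar(2));      // 1
        System.out.println(lgStar(16));     // 3
        System.out.println(lgStar(65536));  // 4; lg* n stays <= 5 for every n < 2^65536
    }
}
```

So for any input size that fits in a physical computer, the lg* N factor in the theorem is a small constant (at most 5), which is why the bound is "almost linear" in practice.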

answered Feb 17 '26 by Ami Tavory

As I recall, the proof had to do with amortizing the cost of the path compression over a set of searches. Shallow searches are cheap, and don't incur the cost of much path compression; deep searches cost a lot for that search and the path compression, but the path compression makes subsequent searches much cheaper on average.
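That amortization effect is easy to observe directly. The sketch below (entirely my own instrumentation, not from the question) builds a worst-case chain by hand and counts parent-pointer hops for two consecutive finds on the deepest node; the first, expensive find compresses the path, so the second one traverses far fewer pointers:

```java
public class CompressionDemo {
    static int[] id;
    static int hops; // parent-pointer hops taken by root()

    // find with path halving, counting each hop toward the root
    static int root(int i) {
        while (i != id[i]) {
            id[i] = id[id[i]]; // path compression by halving
            i = id[i];
            hops++;
        }
        return i;
    }

    // Build a degenerate chain 0 <- 1 <- ... <- n-1 and measure the hop
    // counts of two consecutive finds on the deepest node.
    static int[] measure(int n) {
        id = new int[n];
        for (int i = 1; i < n; i++) id[i] = i - 1; // id[0] = 0 is the root
        hops = 0; root(n - 1); int first = hops;
        hops = 0; root(n - 1); int second = hops;
        return new int[] { first, second };
    }

    public static void main(String[] args) {
        int[] h = measure(1024);
        System.out.println("first find:  " + h[0] + " hops");
        System.out.println("second find: " + h[1] + " hops");
    }
}
```

(A real weighted quick-union never produces such a deep tree; the chain is just a convenient way to make the compression visible.)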

answered Feb 17 '26 by Scott Hunter
