I have read quite a bit on big-O notation and I have a basic understanding. This is a specific question that I hope will help me understand it better.
If I have an array of 100 integers (no duplicates, randomly generated) and I use heapsort to sort it, I know that the big-O notation for heapsort is n lg n. For n = 100, this works out to 100 × 6.64, which is roughly 664.
I understand this to be the upper bound on the number of comparisons, so my count can be less than 664. But if I count the comparisons made while heapsorting an array of 100 random numbers, should that count always be less than or equal to 664?
I am trying to add counters to my heapsort to measure the number of comparisons, and I am coming up with crazy numbers. I will continue to work it out, but I wanted to verify that I am thinking of the upper bound properly.
Thanks!
Big-O notation does not give you an exact upper bound on a function's runtime - instead, it tells you asymptotically how the function's runtime grows. If a function has runtime O(n log n), it means that the function's runtime grows no faster than a constant multiple of n log n. That means, for example, that the actual runtime could be 23 n log n + 17 n, or it could be 0.05 n log n. Consequently, you can't use the fact that heapsort is O(n log n) to count the number of comparisons made. You'd need a more precise analysis.
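To make that concrete, here is a quick numeric check (the two formulas are the hypothetical runtimes from the paragraph above, not real heapsort counts):

```python
import math

n = 100
lg_n = math.log2(n)              # ~6.64

print(23 * n * lg_n + 17 * n)    # ~16981: far above n lg n ~= 664
print(0.05 * n * lg_n)           # ~33: far below it
```

Both formulas are O(n log n), yet at n = 100 one gives roughly 25 times the asker's 664 figure and the other gives a twentieth of it.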
It just so happens that you can get a very precise analysis of heapsort, but it requires a more meticulous look at the algorithm. You can show, for example, that the number of comparisons required by make-heap is at most 3n, and that the number of comparisons made during the n repeated calls to extract-max is at most 2n log (n + 1) (the binary heap has log (n + 1) layers, and during each of the n extract-max's, at most two comparisons are made per layer). This gives an overall number of comparisons upper-bounded by 2n log (n + 1) + 3n.
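Here is a minimal sketch of the kind of counting the asker describes, assuming a standard array-based max-heap (the counter placement mirrors the per-layer analysis above):

```python
import math
import random

comparisons = 0  # global counter, kept simple for this sketch

def sift_down(a, root, end):
    """Sift a[root] down within a[0:end], restoring the max-heap property.
    At most two element comparisons are made per layer."""
    global comparisons
    while 2 * root + 1 < end:
        child = 2 * root + 1
        if child + 1 < end:            # pick the larger of the two children
            comparisons += 1
            if a[child] < a[child + 1]:
                child += 1
        comparisons += 1               # compare the larger child to the root
        if a[root] < a[child]:
            a[root], a[child] = a[child], a[root]
            root = child
        else:
            break

def heapsort(a):
    n = len(a)
    for i in range(n // 2 - 1, -1, -1):   # make-heap
        sift_down(a, i, n)
    for end in range(n - 1, 0, -1):       # repeated extract-max's
        a[0], a[end] = a[end], a[0]
        sift_down(a, 0, end)

n = 100
data = random.sample(range(10 * n), n)    # 100 distinct random integers
heapsort(data)
assert data == sorted(data)

bound = 2 * n * math.log2(n + 1) + 3 * n  # ~1632 for n = 100
print(comparisons, "<=", round(bound))
```

On a typical run the count lands well above n lg n ≈ 664 but comfortably under the 2n log (n + 1) + 3n ≈ 1632 bound, which is consistent with the "crazy numbers" the asker is seeing.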
The famous Ω(n log n) sorting barrier can be used to get a matching lower bound. Any comparison-based sorting algorithm, of which heapsort is one, must make at least log n! comparisons in the worst case (logs base 2 throughout). By Stirling's approximation, log n! = n log n - n log e + O(log n), so heapsort is required to make at least roughly n log n - 1.44n comparisons in the worst case. (Note that this is actually n log n, not some constant multiple of n log n. You can read over the proof of the Ω(n log n) barrier for why this is.)
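For the asker's concrete n = 100, this lower bound is easy to evaluate exactly; here is a quick check using the log-gamma function:

```python
import math

n = 100
# log2(n!) computed as ln(n!) / ln(2), using lgamma(n + 1) = ln(n!)
print(math.lgamma(n + 1) / math.log(2))   # ~524.8: the information-theoretic floor
print(n * math.log2(n))                   # ~664.4: n lg n, for comparison
```

So any comparison sort needs at least 525 comparisons in the worst case on 100 elements, and heapsort's actual count on a typical input will sit between that floor and the 2n log (n + 1) + 3n ceiling.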
Hope this helps!
Let's say that you know that your algorithm requires O(n log_2 n) comparisons when sorting n elements.
This tells you the following, and only the following: there exist constants C and n_0 such that, for all n beyond n_0, the algorithm never requires more than C * n * log_2 n comparisons.
It does not tell you anything about the specific number of comparisons that might be required for any particular value of n -- it tells you how the number of comparisons required grows in the limit as the number of elements grows.
You cannot use the big-O complexity of your sorting algorithm to prove anything about the behaviour of a particular finite n, such as 100 elements. Sorting 100 elements might require 64 comparisons, or 664, or 664 million. The latter is clearly not reasonable, but big-O simply provides no information here.
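You can see this empirically. Here is a small sketch (using Python's built-in sort rather than heapsort, purely for brevity) that wraps each element so that every comparison increments a counter:

```python
import random

def count_comparisons(values):
    """Sort with Python's built-in sort, counting element comparisons."""
    count = 0

    class Counted:
        def __init__(self, v):
            self.v = v
        def __lt__(self, other):     # Python's sort only needs "<"
            nonlocal count
            count += 1
            return self.v < other.v

    sorted(Counted(v) for v in values)
    return count

for _ in range(3):
    data = random.sample(range(1000), 100)
    print(count_comparisons(data))   # varies from run to run
```

The counts differ from input to input; knowing only the O(n log n) class, you could not have predicted any of them.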