I have read quite a bit on big-O notation and I have a basic understanding. This is a specific question that I hope will help me understand it better.
If I have an array of 100 integers (no duplicates, randomly generated) and I use heapsort to sort it, I know that the big-O notation for heapsort is n lg n. For n = 100, this works out to 100 × 6.64, which is roughly 664.
I understand this to be the upper bound on the number of comparisons, so my count can be less than 664. But if I count the comparisons made while heapsorting an array of 100 random numbers, should that count always be less than or equal to 664?
I am trying to add counters to my heapsort to measure the number of comparisons, and I am coming up with crazy numbers. I will continue to work it out, but I wanted to verify that I am thinking of the upper bound properly.
Thanks!
Big-O notation does not give you an exact upper bound on a function's runtime - instead, it tells you asymptotically how the function's runtime grows. If a function has runtime O(n log n), it means that the function's runtime grows no faster than a constant multiple of n log n. That means, for example, that the actual runtime could be 23 n log n + 17 n, or it could be 0.05 n log n. Consequently, you can't use the fact that heapsort is O(n log n) to count the number of comparisons made. You'd need a more precise analysis.
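To make that concrete, here is a quick numeric check (the two formulas are the hypothetical runtimes from the paragraph above, not real heapsort counts):

```python
import math

n = 100
lg_n = math.log2(n)              # ~6.64

print(23 * n * lg_n + 17 * n)    # ~16981: far above n lg n ~= 664
print(0.05 * n * lg_n)           # ~33: far below it
```

Both formulas are O(n log n), yet at n = 100 one gives roughly 25 times the asker's 664 figure and the other gives a twentieth of it.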
It just so happens that you can get a very precise analysis of heapsort, but it requires a more meticulous look at the algorithm. You can show, for example, that the number of comparisons required by make-heap is at most 3n, and that the number of comparisons made during the n repeated calls to extract-max is at most 2n log (n + 1) (the binary heap has log (n + 1) layers, and during each of the n extract-max's, at most two comparisons are made per layer). This gives an overall number of comparisons upper-bounded by 2n log (n + 1) + 3n.
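Here is a minimal sketch of the kind of counting the asker describes, assuming a standard array-based max-heap (the counter placement mirrors the per-layer analysis above):

```python
import math
import random

comparisons = 0  # global counter, kept simple for this sketch

def sift_down(a, root, end):
    """Sift a[root] down within a[0:end], restoring the max-heap property.
    At most two element comparisons are made per layer."""
    global comparisons
    while 2 * root + 1 < end:
        child = 2 * root + 1
        if child + 1 < end:            # pick the larger of the two children
            comparisons += 1
            if a[child] < a[child + 1]:
                child += 1
        comparisons += 1               # compare the larger child to the root
        if a[root] < a[child]:
            a[root], a[child] = a[child], a[root]
            root = child
        else:
            break

def heapsort(a):
    n = len(a)
    for i in range(n // 2 - 1, -1, -1):   # make-heap
        sift_down(a, i, n)
    for end in range(n - 1, 0, -1):       # repeated extract-max's
        a[0], a[end] = a[end], a[0]
        sift_down(a, 0, end)

n = 100
data = random.sample(range(10 * n), n)    # 100 distinct random integers
heapsort(data)
assert data == sorted(data)

bound = 2 * n * math.log2(n + 1) + 3 * n  # ~1632 for n = 100
print(comparisons, "<=", round(bound))
```

On a typical run the count lands well above n lg n ≈ 664 but comfortably under the 2n log (n + 1) + 3n ≈ 1632 bound, which is consistent with the "crazy numbers" the asker is seeing.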
The famous Ω(n log n) sorting barrier can be used to get a matching lower bound. Any comparison-based sorting algorithm, of which heapsort is one, must make at least log n! comparisons in the worst case (logs base 2 throughout). By Stirling's approximation, log n! = n log n - n log e + O(log n), so heapsort is required to make at least roughly n log n - 1.44n comparisons in the worst case. (Note that this is actually n log n, not some constant multiple of n log n. You can read over the proof of the Ω(n log n) barrier for why this is.)
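For the asker's concrete n = 100, this lower bound is easy to evaluate exactly; here is a quick check using the log-gamma function:

```python
import math

n = 100
# log2(n!) computed as ln(n!) / ln(2), using lgamma(n + 1) = ln(n!)
print(math.lgamma(n + 1) / math.log(2))   # ~524.8: the information-theoretic floor
print(n * math.log2(n))                   # ~664.4: n lg n, for comparison
```

So any comparison sort needs at least 525 comparisons in the worst case on 100 elements, and heapsort's actual count on a typical input will sit between that floor and the 2n log (n + 1) + 3n ceiling.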
Hope this helps!
Let's say that you know that your algorithm requires O(n log_2 n) comparisons when sorting n elements.
This tells you the following, and only the following: there exist constants C and n_0 such that, for all n beyond n_0, the algorithm never requires more than C * n * log_2 n comparisons.
It does not tell you anything about the specific number of comparisons that might be required for any particular value of n -- it tells you how the number of comparisons required grows in the limit as the number of elements grows.
You cannot use the big-O complexity of your sorting algorithm to prove anything about the behaviour of a particular finite n, such as 100 elements. Sorting 100 elements might require 64 comparisons, or 664, or 664 million. The latter is clearly not reasonable, but big-O simply provides no information here.
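You can see this empirically. Here is a small sketch (using Python's built-in sort rather than heapsort, purely for brevity) that wraps each element so that every comparison increments a counter:

```python
import random

def count_comparisons(values):
    """Sort with Python's built-in sort, counting element comparisons."""
    count = 0

    class Counted:
        def __init__(self, v):
            self.v = v
        def __lt__(self, other):     # Python's sort only needs "<"
            nonlocal count
            count += 1
            return self.v < other.v

    sorted(Counted(v) for v in values)
    return count

for _ in range(3):
    data = random.sample(range(1000), 100)
    print(count_comparisons(data))   # varies from run to run
```

The counts differ from input to input; knowing only the O(n log n) class, you could not have predicted any of them.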