Does anyone know both the expected running times and worst case running times for different implementations of std::nth_element?  I use this algorithm nearly every day.
I'm specifically interested in the STL versions shipping with the recent Microsoft Compilers, but any information on this topic is helpful.
Please note that this is not a duplicate of this question. I understand which algorithms exist, but I'm interested in which implementations use which algorithms.
For background, there are well-known algorithms to do this. One is O(n) average case and O(n log n) worst case, one is O(n) worst case but slow in practice (median of medians). Also note that there is talk of interesting implementation strategies to get worst-case O(n) running times that are fast in practice. The standard says that this must be at worse O(n) average time.
The nth_element function is typically implemented using Introselect, which brings the average complexity down to O(n).
If you only want to move the n-th element into place, instead of recursing into both sub-arrays, you can tell in every step whether you will need to descend into the left or right sub-array. (You know this because the n-th element in a sorted array has index n so it becomes a matter of comparing indices.)
C++ Algorithm nth_element() function is used to sort the elements between the first and nth element in ascending order and element between nth and last are not sorted. However, no element in between nth and last is smaller than an element between first and nth element.
Here, we have passed greater () as binary function, so now nth element will be the one which should be at the nth place if we sort the given array in descending order, so first n elements will be the first n largest elements. It can be used to find the median of the elements given.
The expected running time is O(N) The worst case running time for most implemententations is O(N * N) because most implementations use QuickSelect and it could be that QuickSelect runs into bad partitions. That is true for Microsoft VS2008, VS2010 & VS2012.
Now with the new ISO C++ 2011 standard, the complexity for std::sort has been tightened up - it is guaranteed to be O(N * log N) and has no worse case as David Musser's IntroSort is used: - use QuickSort and if parts of the array experience bad partitioning, swap to heapsort.
Ideally exactly the same should apply std::nth_element but the ISO C++ 2011 standard has not tightened up the complexity requirements. So std::nth_element could be O(N * N) in the worst case. This could be because in David Musser's original paper (see here) he did not mention what algorithm should be swapped to if QuickSelect goes bad.
In the worst case, the median-of-medians using groups of 5 could be used (I have seen a paper the recommended groups of 7 but cannot find it). So a quality implementation of std::nth_element could use QuickSelect and swap to median-of-medians if partitioning goes bad. This would guarantee O(N) behaviour. QuickSelect can be improved by using sampling making the worst case unlikely but not impossible.
The implementation in GCC 4.7 uses introspective selection by David Musser (here you have his paper giving details on introsort and introselect). According to those documents the worst-case execution time is O(n).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With