HEAP SORT

Now that we know about binary heaps, we might as well use them for something!
It turns out that you can use heap operations to make a cool sorting algorithm!

HEAP-SORT

Say we have an array arr of n arbitrary numbers that we want to sort.

Here is the HEAP-SORT algorithm:

HEAP-SORT(arr) algorithm:
    MAX-HEAPIFY(arr)          // Turn the array into a max heap (max at the root)
    for (i = 1; i ≤ n; ++i)
        x = GET-MAX(arr)       // Remember this is a max heap
        POP-MAX(arr)
        arr[n − i] = x               // This spot is no longer part of the heap

At the end of this algorithm, the array that used to hold the heap will be filled with the numbers in sorted order.

Here's a cool HEAP-SORT visualizer: https://www.cs.usfca.edu/~galles/visualization/HeapSort.html

HEAP-SORT Analysis

So how long does HEAP-SORT take?

The first step is MAX-HEAPIFY. We just decided that we can do that in \( \Theta(n) \) time.

Next we have \( n \) calls to POP-MAX. But every time we call it, the heap gets smaller!

So the first time takes \( \mathrm{O}( \log{n} ) \) time. The second time takes \( \mathrm{O}( \log{ ( n -1 ) } \)) time. Adding them up, we have…

\[ \mathrm{O}\bigl( \log{ n } + \log{ (n - 1 ) } + \log{ ( n - 2 ) } + \cdots{} + \log{ ( 2 ) } + \log{ ( 1 ) } \bigr). \]

Surely we can agree that every term in the sum is upper bounded by \( \log{n} \), right? So it's true that the complexity is

\[ \mathrm{O}\left( \log{n} + \log{n} + \cdots{} + \log{n} \right) = \mathrm{O}( n \log{n} ). \]

We might be worried that this is a loose bound, but it turns out that it's possible to prove that the worst-case complexity for any algorithm that sorts an arbitrary array of numbers is \( \Omega( n \log{n} ) \). That is, the worst-case complexity can't be better than linearithmic. So this bound is tight, and HEAP-SORT has the best possible worst-case time!

The best case for HEAP-SORT is when all the items are equal, in which case heapify still takes \( \Theta(n) \) time, but each POP-MAX call takes \( \Theta(1) \), resulting in \( \Theta(n) \) time overall. But this unusual case is actually the only way we can coax \( \Theta(n) \) performance out of HEAP-SORT, otherwise it's always \( \Theta(n \log n) \).

Running Code

I get the ideas, but it'd all be more concrete if I could see some running code.
Yes, I think I might understand it better if I saw how you code some of these things.
MORE code!!!
Okay.

A `Heap<T>` Data Structure

Here's a full implementation of a Heap<T> class.

Open Code in OnlineGDB
- Check out the heap implementation in heap-private.hpp
- Examine and run the example code that uses the heap in main.cpp.

The C++ Standard Library's Heap Facilities and Heap-Sorting

And this code shows the heap-sort algorithm (using C++'s built-in heap-making facilities).
For best timing results with this code, click on the gear icon in the top corner and add -O3 to the compiler options. (That's the letter “O” for “optimize” not a zero there.)

This example code shows an in-place heap sort using C++'s heap facilities.

Open Code in OnlineGDB

Meh. I'm kinda disappointed. HEAP-SORT is all like “I'm the fastest because I'm \( \Theta(n \log n) \)”, and then it's slower than std::sort. So I guess we know std::sort isn't heap sort and HEAP-SORT is garbage.
We never said HEAP-SORT was always the fastest, just that it gives a good worst-case asymptotic guarantee.
What algorithm does std::sort use?
It depends on the particular C++ standard library, but typically they use robust industrial-strength sorting algorithms, which are often a hybrid of multiple familiar algorithms, including HEAP-SORT.
If I were making std::sort I'd use pdqsort, and for std::stable_sort I'd use timsort. And this article has a great discussion of the engineering that goes into writing and maintaining std::sort.
Okay… There's certainly lots down the “sorting algorithms” rabbit hole.
What's the difference between std::sort and std::stable_sort?
std::stable_sort leaves equal elements in the order it found them, whereas std::sort can leave them in a different order.
So if you want to, say, a have list of students students sorted by dorm (most importantly), then class year, then name, you'd want to use std::stable_sort, and you'd first sort by name, then class year, then dorm. If you used std::sort you couldn't do that because each sort would totally destroy the order from the previous sort.

(When logged in, completion status appears here.)

Previous Page Next Page

CS 70

Warning: Missing Prerequisites

HEAP SORT

HEAP-SORT

HEAP-SORT Analysis

Running Code

A `Heap<T>` Data Structure

The C++ Standard Library's Heap Facilities and Heap-Sorting

Instructor Controls

Warning: Missing Prerequisites

HEAP SORT

HEAP-SORT

HEAP-SORT Analysis

Running Code

A Heap<T> Data Structure

The C++ Standard Library's Heap Facilities and Heap-Sorting

Instructor Controls

A `Heap<T>` Data Structure