CS 70

Heapify

  • Dog speaking

    What if I had a bunch of items already in an array and wanted to make it into a heap?

  • LHS Cow speaking

    Any ideas?

  • Cat speaking

    We could go from front to back and let each item float up to the level where it belongs.

  • LHS Cow speaking

    Oh, cool. That would essentially be like inserting each item one at a time.

  • Duck speaking

    Or… could we go from the back to the front and let each item sink to where it belongs in its subtree?

  • LHS Cow speaking

    Sure! That would be sort of like the bottom of the tree is all heapish, and we keep sinking the things at the top into the lower heaps until its all one big heap.

  • RHS Cow speaking

    But which one is better?

  • LHS Cow speaking

    Hmm… let's check it out!

Min-Heapify (Rough) Analysis

Option 1 is something like

for (i = 0; i < n; ++i)
    FLOAT-UP(i)

In this version, when we're halfway through, the first half of the array will be a nicely made heap, but the second half will be a mess of unordered nodes.

Option 2 is something like this:

for (i = n − 1; i ≥ 0; −−i)
    SINK-DOWN(i)

In this version, when we're halfway through, the second half of the array will be a bunch of subheaps, but the first half will represent a part of the heap that only satisfies the structure property, not the order property.

Best-Case Analysis

What is the best-case complexity for both options?

In the best case, the array is already a heap! In that case, none of the nodes have to move, so FLOAT-UP / SINK-DOWN takes \( \Theta(1) \) time.

Thus we are doing \( n \times \Theta(1) \)-time operations: \( \Theta(n) \) time.

Worst-Case Analysis

Now let's imagine the worst case for each item.

  • For option 1, that means floating all the way to the root.
  • For option 2, that means sinking all the way to the bottom.

For nodes at different depths, float and sink take different amounts of effort.

  • To float the root up to the top takes no swaps at all!
  • To float a leaf up to the top takes \( \Theta( \log{n} ) \) swaps.
  • To sink the root to the bottom takes \( \Theta( \log{n} ) \) swaps.
  • To sink a leaf down to the bottom takes no swaps!

There are also different numbers of nodes at different depths.

  • At depth 0 there is 1 node (the root).
  • At depth 1 there are 2 nodes.
  • At depth 2 there are 4 nodes.
  • And so on…

Here, let's extrapolate a bit…

Depth Num. nodes Swaps to the top Swaps to the bottom
\( 0 \) \( 1 \) \( 0 \) \( \lfloor ( \log{n} ) \rfloor \)
\( 1 \) \( 2 \) \( 1 \) \( \lfloor ( \log{ ( n - 1 ) } ) \rfloor \)
\( 2 \) \( 4 \) \( 2 \) \( \lfloor ( \log{ ( n - 2 ) } ) \rfloor \)
\( \log{ ( n - 2 ) } \) \( n/8 \) \( \lfloor ( \log{ ( n - 2 ) } ) \rfloor \) \(2\)
\( \log{ ( n - 1 ) } \) \( n/4 \) \( \lfloor ( \log{ ( n - 1 ) } ) \rfloor \) \( 1 \)
\( \log{n} \) \( n/2 \) \( \lfloor ( \log{n} ) \rfloor \) \( 0 \)

So we just need to add up, for each level, how much effort each node takes times the number of nodes!

Would you rather have…

Option 1 (front to back)

$$ ( 0 \times 1 ) + ( 1 \times 2 ) + (2 \times 4 ) + \cdots{} + ( \lfloor \log{ ( n - 2 ) } \rfloor \times n/8 ) + ( \lfloor \log{ ( n - 1 ) } \rfloor \times n/4 ) + ( \lfloor \log{n} \rfloor \times n/2 ) $$

or

Option 2 (back to front)

$$ ( \lfloor ( \log{n} ) \rfloor \times 1 ) + ( \lfloor \log{ ( n - 1 ) } \rfloor \times 2 ) + ( \lfloor \log{ ( n - 2 ) } \rfloor \times 3 ) + \cdots{} + ( 2 \times n/8 ) + ( 1 \times n/4 ) + ( 0 \times n/2 )? $$

Roughly speaking, we can just compare the largest term in each, right?

  • Option 1: The largest term is the last one: \( \lfloor \log{ n } \rfloor \times n/2 = \Theta( n \log{n} ) \).
  • Option 2: The largest term is the second to last one: \( 1 \times n/4 = \Theta(n) \).
  • RHS Cow speaking

    With option 2, we can make a heap out of arbitrary keys in \( \Theta(n) \) worst-case time!

  • LHS Cow speaking

    Even though inserting them one at a time would take \( \Theta( n \log{n} ) \) in the worst case. Wow!

  • RHS Cow speaking

    The key insight is that fully half of the tree is made of leaves.

  • LHS Cow speaking

    So we should minimize the amount of work we do with the leaves. Hence, sinking everything down is better than floating everything up!

  • RHS Cow speaking

    Isn't that amazing??

  • Goat speaking

    Meh. I'm not convinced. That was not a proof.

  • LHS Cow speaking

    Well, good. You're right. That was not a proof.

  • RHS Cow speaking

    It's the end of the semester. We are all tired. So we've spare you the fully detailed derivation of these complexities.

  • Goat speaking

    Fine with me!

  • Rabbit speaking

    But if you want to go down the rabbit hole though, you can ask one of the Profs in person. They'll be happy to show you a nifty proof.

  • LHS Cow speaking

    If you keep studying CS, you'll probably encounter it at some point.

Conclusion

For MIN-HEAPIFY, we should do option 2 (back to front):

for (i = n − 1; i ≥ 0; −−i)
    SINK-DOWN(i)

The best-case complexity is: \( \Theta(n) \).

The worst-case complexity is: \( \Theta(n) \).

What is the most accurate and precise characterization of the complexity of MinHeapify?

Since both the best- and worst-case times are both \( \Theta(n) \), the complexity of MIN-HEAPIFY is \( \Theta(n) \): linear time in all cases, never better, never worse.

(When logged in, completion status appears here.)