CS 70

Before You Start

You've implemented a set interface with a binary-search tree. The main operations for a set are INSERT and LOOKUP; ideally we want both of these to be efficient.

In a red–black tree, what is the worst-case complexity of INSERT and LOOKUP?

For a red–black tree, the performance for INSERT and LOOKUP is

  • \( \Theta( \log{n} ) \) in the worst case
  • \( \Theta(1) \) in the best case
  • \( \mathrm{O}(\log{n}) \) overall
  • Goat speaking

    Meh, \( \mathrm{O}(\log{n}) \) seems kinda clunky.

  • Duck speaking

    Yeah, when I ran minispell from the homework, even with a perfectly balanced tree, with 34,831 words the height was 15, so there would be 16 string comparisons made in the worst case. That's not a huge number, but it's not nothing.

  • Pig speaking

    I want MORE speed!

  • Duck speaking

    Arrays are constant time to access an element, right? Couldn't we just use an array to store the words?

Using an Array-Backed List to Store a Set

Suppose that we instead implemented our set with an array-backed list (e.g., std::vector). We could do the following:

  • INSERT: Add the new item to the end of the list (i.e., push_back()).
  • LOOKUP: Search for the item in the list.
  • Hedgehog speaking

    Can you remind me what an array-backed list is?

  • LHS Cow speaking

    Fundamentally, we store the data in an array, but the array may have some empty space at the end. We keep track of how much of the array is actually in use, and new items go just past the used portion. If we run out of space, we allocate a bigger array (usually twice as big) and copy everything over. Because the array lives on the heap, we can choose its size at runtime and replace it with a new one whenever we need to.
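As a concrete sketch of the two operations above (the helper names insertItem and lookup are ours, not from the homework), a set stored in a std::vector might look like this:

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// INSERT: add the new item at the end of the array-backed list.
void insertItem(std::vector<std::string>& set, const std::string& item) {
    set.push_back(item);  // amortized constant time
}

// LOOKUP: scan the used portion of the array for a match.
bool lookup(const std::vector<std::string>& set, const std::string& item) {
    return std::find(set.begin(), set.end(), item) != set.end();
}
```

Notice that INSERT doesn't even look at the existing items, which is why it can be so fast; the cost shows up in LOOKUP instead.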

When using an array-backed list to store a set, what would be the best characterization of the complexity of INSERT?

Adding to an array-backed list with push_back() takes amortized \( \Theta(1) \) time: occasionally we have to increase the capacity of the underlying array, which means copying all the data over, but those expensive copies happen rarely enough that the cost averages out to constant work per insertion.
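To see why the copies average out, here's a small model (our own sketch, not course code) that counts how many element copies a capacity-doubling strategy performs over \( n \) appends. The total stays below \( 2n \), so the copying cost per append is bounded by a constant:

```cpp
#include <cassert>
#include <cstddef>

// Count the element copies a doubling array makes over n appends,
// starting from capacity 1.
std::size_t copiesForAppends(std::size_t n) {
    std::size_t capacity = 1, size = 0, copies = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (size == capacity) {  // array is full: double it and copy over
            copies += size;      // every existing element gets copied once
            capacity *= 2;
        }
        ++size;                  // write the new item into the free space
    }
    return copies;
}
```

For example, appending 8 items copies 1 + 2 + 4 = 7 elements in total, and appending 1000 items copies 1023, both well under \( 2n \).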

  • Duck speaking

    See, I knew it. Arrays are awesome. Amortized constant time.

  • Cat speaking

    And I think if we knew \( n \) ahead of time, we could avoid resizing and it'd just be constant time.
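Cat's idea corresponds to std::vector's reserve(): if we know \( n \) ahead of time, we can allocate all the capacity once, and no push_back() will ever trigger a resize. A quick check (our own sketch) confirms the capacity never changes after reserving:

```cpp
#include <cstddef>
#include <vector>

// Reserve capacity for n items up front, then append n items;
// return true if the capacity never changed (i.e., no resizing occurred).
bool noResizeAfterReserve(std::size_t n) {
    std::vector<int> v;
    v.reserve(n);                          // one allocation, done
    std::size_t cap = v.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<int>(i));  // no reallocation while size < capacity
    }
    return v.capacity() == cap;
}
```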

But what about LOOKUP…?

When using an array-backed list to store a set, what would be the best characterization of the complexity of LOOKUP?

Because the items are stored in no particular order, LOOKUP has to scan the list from one end to the other, so it takes \( \mathrm{O}(n) \) time in the worst case.

  • Duck speaking

    Needing to go all the way through is the problem, giving us \( \mathrm{O}(n) \) performance, way worse than red–black trees.

  • Dog speaking

    It'd be cool if it just knew the right place to look. Like I know where I buried a bone. Then it could just find it in constant time.

  • Duck speaking

    Yeah, but how could that possibly work? How could it know where to find it?

  • LHS Cow speaking

    Well, actually…
