Comparisons: Pros and Cons

Empty buckets aren't necessarily “wasted space”—they make search faster.
- Particularly unsuccessful search, which may need to perform no comparisons at all to determine the item isn't in the table.

Pro: Simple and easy to implement
Pro: Tied for best unsuccessful-lookup expected comparisons.
Pro: No number-theoretic demands on the number of buckets (powers of two are fine, so successive doubling is easy).
Pro: Straightforward removal.
- We haven't talked about it, but just remove the key from its chain!
Con: Tied for worst successful lookup expected comparisons.
Con: Memory overhead of storing pointers for chains.
- Can be significant if the items being stored are small, but not a big deal if items stored are large.

Pro: Best expected lookup and insert performance under ideal conditions.
- As a result, we can have higher load factors/smaller tables with good expected comparisons.
Pro: Addresses any clustering concerns (better even than separate chaining).
Pro: Very space efficient for small objects, if we store them directly in the table.
Meh: Needs two independent values for hash and increment.
- But we can arrange things can actually come from the same hash function, so not a big deal.
Meh: In some variants, the number of buckets has to be prime (which makes successive doubling a bit more annoying).
- But we can use a power-of-two size and an odd increment instead.
Con: Relatively more sensitive to choice of hash functions.
Con: Removal is complicated.
- Can't just leave an empty bucket—that would mess up lookup!

Pro: Combats primary clustering.
Pro: Expected comparisons nearly as good as double hashing.
Pro: Very space efficient for small objects, if we store them directly in the table.
Con: Vulnerable to secondary clustering (things that start in the same bucket cluster together, just like separate chaining)
Con: Number of buckets needs to be prime (which makes successive doubling a bit more annoying).
Con: Requires \( \lambda \lt 0.5 \) to guarantee that an empty bucket will be found.
- (That means a lot of unused space in the table).
Con: Removal is complicated.

Pro: The easiest to implement open-addressing method.
Pro: Sometimes CPU caching makes it efficient to access nearby elements in an array, making probing faster.
Pro: No number-theoretic demands on the number of buckets (powers of two are fine, so successive doubling is easy).
Pro: Very space efficient for small objects, if we store them directly in the table.
Pro: Although it works best when \( \lambda \lt 0.5 \), it still works when \( \lambda \gt 0.5 \)—unlike quadratic probing, which could lock up.
Meh: Removal is the least complicated of all the open-addressing methods, but still more complex than separate chaining.
- You just “rehash” the rest of the cluster after the removed element—something that's impossible for techniques that break up clusters.
Con: Prone to primary clustering: Worst overall expected comparisons.
- Requires low load factors for good performance.
  - But when \( \lambda \lt 0.5 \), primary clustering is not that big a deal.

(When logged in, completion status appears here.)

CS 70