CS 70

Phase 5: Rehashing

To meet our complexity guarantees when people don't know the size of their hash table ahead of time, we need to be able to rehash the hash table, creating a new, different-sized, underlying array of buckets and moving our items into that array.


Handy reference material:

Let's Go!

  • RHS Cow speaking

    There are some good tips in the Hints and Tips section; check them out before you start!

Following our usual development process, implement the following functions:

  • loadFactor()
  • maxLoadFactor()
  • rehash()
  • reallocations()

Also adjust the code for insert so that if an insert would cause the load factor to exceed the maximum, the rehash function is called automatically.

Planning Image

Add a planning image, Phase_5.jpg, to the Planning folder.

Hints and Tips

Writing loadFactor()

This function should return the current load factor of the table (i.e., the number of keys divided by the number of buckets).

The load factor of an empty table should be 0.

Careful! The load factor is a double value but dividing one integer by another will result in an integer value!

Writing rehash()

Probably the easiest way to write rehash() is to

  • Create a new (empty) HashSet with the desired properties
  • Go through all the buckets in the original hash set, taking the data out of the original hash set and inserting it into the new one, using an insertion function that doesn't check to make sure it's already there (because we know it isn't).
  • Finish up by swap()ping the new one and the old one.
  • The old HashSet will get cleaned up automatically when the variable goes away.

Testing

Because you can explicitly call rehash() and explicitly set the maximum load factor, either in the constructor or via maxLoadFactor, you can write tests that will work on any correct HashSet implementation.

(When logged in, completion status appears here.)