HashSet<T>
Specifications
This page describes the interface for the HashSet
class template.
An instance of HashSet<T>
will store values of the type T
(as specified by the user) in a hash table that uses separate chaining and dynamic resizing.
Your implementation must support all the functionality described on this page, including the specified complexity requirements. You may not alter the interface (by changing names or adding/removing public functions).
You may add private helper member functions, but these are not part of the interface and not described here.
User-Provided myhash()
Function
The HashSet
class will assume that the user of the hash table has provided a myhash
function for the type stored in the table. For example, if a uses wants to store Cow
objects in a HashSet<Cow>
, they will have also written a global function size_t myhash(const Cow& c)
that provides hash values for Cow
s using the full range of the size_t
type.
In other words, your HashSet
should not implement its own hash function. When it needs to hash a key, it should simply call myhash()
on that key, and assume that this function will be defined.
This design allows a user to define their own hash function for their particular problem and their own selected key type.
Allowable Key Types
You may not require anything of the type of keys stored in the hash table beyond the following:
- The key type should have a copy constructor, assignment operator, and equality operator; and,
- The user should have defined a
myhash()
function for that type (as described above). - If the user wants to print out the hash table, their type must also be printable.
In other words, your HashSet
class template may not perform any operation on a key other than copy construction, assignment, equality comparison, myhash()
, and (optionally), the <<
operator.
HashSet
Interface
The class HashSet
, on some type T
(i.e., HashSet<T>
) provides the following class-wide (a.k.a. static
) compile-time (a.k.a. constexpr
) constants:
DEFAULT_NUM_BUCKETS
, asize_t
specifying the default number of buckets to use if not specified by the user.DEFAULT_MAX_LOAD_FACTOR
, adouble
, specifying the maximum allowed load factor for the hash table (going beyond this value will trigger an expansion of the hash table to reduce it).
An instance of HashSet
, on some type T
(i.e., HashSet<T>
), will provide the following operations:
- An
explicit
parameterized constructor that:- Takes two arguments. Each argument has a default value that will be used if the argument is not specified.
- a
size_t
indicating the initial number of buckets in the hash table.- Defaults to
DEFAULT_NUM_BUCKETS
.
- Defaults to
- a
double
indicating the maximum load factor.- Defaults to
DEFAULT_MAX_LOAD_FACTOR
.
- Defaults to
- a
- Because the arguments are optional, this constructor is also the default constructor.
- Takes two arguments. Each argument has a default value that will be used if the argument is not specified.
- A
swap
function that swaps the contents of twoHashSet<T>
objects that- Takes one argument, the
HashSet<T>
to swap with; - Does not return a value;
- Requires \(\Theta(1)\) time.
- Takes one argument, the
- A destructor.
- A
size()
member function that- Takes no arguments;
- Returns the number of values in the hash table, as a
size_t
; - Requires \(\Theta(1)\) time;
- Can be called on a read-only set (i.e.,
const
at the end).
- An
insert()
member function that- Takes a
T
by constant reference; - Doesn’t return anything (but inserts the item into the hash table);
- It is okay to call
insert
with aT
value that is already in the hash table. In that case the item is not inserted again.
- It is okay to call
- Takes \( \Theta(1) \) amortized expected time under the probability model discussed in the lesson, with a \( \mathrm{O}(n) \) time bound.
- Takes a
- An
exists()
member function that- Takes a
T
by constant reference; - Returns a
bool
indicating whether the item was found in the hash table; - Takes \( \Theta(1) \) amortized expected time under the probability model discussed in the lesson, with a \( \mathrm{O}(n) \) time bound;
- Can be called on a read-only hash table (i.e.,
const
at the end).
- Takes a
- A
printToStream()
member function that- Takes an
ostream&
; - Prints the hash table out to that stream (the required format is descibed later);
- Returns the same
ostream&
it was passed; - Can be called on a read-only hash table (i.e.,
const
at the end).
- Takes an
In addition, HashSet
should provide the following member functions to control the way the hash table stores its data.
- A
maxLoadFactor()
member function that- Takes a
double
representing a new value for the maximum allowed load factor; - Does not return anything;
- Takes \( \Theta(1) \) time. The hash table is not rehashed by this call.
- Takes a
- A
rehash()
member function that- Does not take any arguments;
- Does not return anything;
- Reallocates the hash table (if necessary) such that the load factor now no more than half the maximum allowed load factor;
- Takes \( \Theta(n) \) time.
In addition, HashSet
should provide the following member functions, which will provide statistics about the hash table. These must all run in \( \Theta(1) \) time.
- A
buckets()
member function that- Takes no arguments;
- Returns the number of buckets in the hash table, as a
size_t
; - Takes \( \Theta(1) \) time.
- A
loadFactor()
member function that- Takes no arguments;
- Returns the current load factor of the hash table, as a
double
;- An empty hash table is defined to have a load factor of zero.
- Takes \( \Theta(1) \) time.
- A
reallocations()
member function that- Takes no arguments;
- Returns the number of times the hash table has reallocated its underlying array, as a
size_t
;- The hash table must be reallocated when:
- An item is inserted and would cause the load factor to go above the maximum load factor, or
- A user calls
rehash
and the current load factor is more than half of the maximum load factor.
- The hash table must be reallocated when:
- Takes \( \Theta(1) \) time.
- A
collisions()
member function that- Takes no arguments;
- Returns the number of times a key has been inserted into a non-empty bucket since the last time the table was reallocated, as a
size_t
;- An empty hash table is defined to have a load factor of zero.
- Takes \( \Theta(1) \) time.
- A
maximal()
member function that- Takes no arguments;
- Returns the largest number of steps ever taken to find a key since the last time the table was reallocated, as a
size_t
;- The assignment description will give more details about this function.
- Takes \( \Theta(1) \) time.
-
A
showStatistics()
member function that- Takes an
ostream&
; -
Prints statistics about the hash to that stream, in the format
n expansions, load factor f, c collisions, longest run l
(where
n
,c
, andl
are integers, andf
is adouble
), followed by a newline. -
Returns the same
ostream&
it was passed. - Note that this function is provided.
- Takes an
You are not required to provide a copy constructor, an assignment operator, an operator==
function, an iterator, or an erase()
function. You are also not required to provide an iterator. You should disable the copy constructor and assignment operator in the class definition.
Top-Level Functions
You should also provide a top-level (not-member) function operator<<
that defines how to print a HashSet<T>
. It takes as arguments an ostream&
and a const HashSet<T>&
; it returns an ostream&
. That operator will call the public printToStream()
member function of the HashSet<T>
class.
Private Encoding: Separate Chaining
We require your HashSet
class template to use separate chaining as its encoding and to represent each bucket using a std::forward_list<T>
, where T
is the type of the keys.
The std::forward_list
data structure is a bare-bones, singly linked list, which is more space efficient than a std::list
as a std::list
is doubly linked and stores its size as a member variable.
The buckets of the hash table are stored in a dynamically allocated array. Each element is a std::forward_list<T>
embodying the chain that holds the keys in that bucket.
Notice that your array elements are
std::forward_list
objects, not pointers to forward lists. In other words, you make the whole array ofstd::forward_list
objects at once—you don't createstd::forward_list
objects individually and then store them in your array.
You will likely need other private member variables, but those are up to you.
(When logged in, completion status appears here.)