What's an Encoding?

If the interface specifies what, the encoding (a.k.a. representation) specifies how.

The encoding of a data structure consists of the design decisions about how the data will be organized: What structures/variables will be needed to store data and any bookkeeping values? What are the roles/meanings of these variables? What relationships must always hold between the pieces?

In C++, the encoding is primarily expressed in the private members of the class that represents the data structure.

The encoding gives concrete goals for the implementation. By specifying the encoding, we can clearly determine what effects each function should have in order to support the interface.

Practice: STRING-LIST

For any given interface there are typically many plausible encodings! Here are a couple of possible encodings for STRING-LIST. Given what you've already seen in your homework and the previous lesson, the encodings we'll explore should be pretty familiar.

STRING-LIST Encoded using Arrays

We could store the contents of the list using an array.

Let's imagine that the array is allocated with some initial size (called capacity).
The first element of the list is placed at index 0 of the array, the second at index 1 of the array, and so on.
If, at some point, there are more list elements than slots in the array, we'll have to create a new and bigger array and move everything to that new array.

Here is a schematic diagram of this encoding:

Terminology: This scheme is often called an array-backed list, but others call it a dynamic array, or a vector, depending on which community is talking about it.

STRING-LIST Encoded using a Linked List

We can also represent the list using a linked list, as you may have studied previously in CS 60 or 42.

Here we have a set of list nodes, where each node contains one list element and a link to the next node in the list.
The front node is called the head.
The back node is called the tail.

Here is a schematic diagram of this encoding:

Can someone remind me what nullptr is?
It's a C++ keyword for a special pointer value that doesn't point anywhere. (Follow this pointer to Pointers vs references in Week 4 to learn more.)
Here, nullptr is used to mark the end of the list.

Trade-Offs: Compare and Contrast

Why would we pick one or the other?
Good question! Every choice of encoding will have advantages and disadvantages.
Ultimately, the choice depends on matching the needs of your problem with the strengths of a particular choice of encoding.

Let's think a bit about the relative advantages and disadvantages of these two possibilities.

Heads up!

The following videos might contain some notation that you're unfamiliar with. This is because they come from an earlier iteration of CS70 where lessons were in a slightly different order. Thankfully, even if you just ignore the unfamiliar notation the videos should still work to get the core ideas across (and our "Key Points" recaps don't use the notation).

If it helps, you can also mentally replace the notation: anywhere you see/hear \(\Theta(1)\) can be mentally replaced with \(\mathrm{O}(1)\) or "constant time", and \(\Theta(n)\) can be replaced with \(\mathrm{O}(n)\) or "linear time".

Key Points

A linked list just needs to move a few pointers around to add a new node. Notice that the number of pointers being moved doesn't depend at all on the number of items in the list, so this operation takes constant time.
An array-based list would need to shift every element over by one to make room for the new item. The more items there are, the more shifts that need to happen, so this is a linear time operation!

Also, the array-backed list might need to resize the array.
Oh, that's true, too! But even if that happened, the cost of resizing the array is also directly proportional to the number of items. So overall the operation is still linear time.
Actually, the cost of those resize operations brings up some really interesting issues. We'll talk more about that at a later date.

Key Points

The array-backed list simply has to index into the array at the appropriate spot, which takes constant time!
The linked list would have to start at the head, and move to the requested location item-by-item. In the worst case we would need the final item (and thus have had to go throgh every item in the list). So this takes linear time.

Both techniques require linear time to perform an insert in general, but for very different reasons!

An array-backed list can find the location for the insert easily in constant time. But it then needs to shift everything over to make room for the new item. In the worst case (inserting near the beginning) that would mean shifting all of the items over to make room. In the best case (inserting near the end) it doesn't shift anything over. So the shifting takes linear time at worst, but could be better. Since we have a constant time operation followed by a (at worst) linear time operation, we say the overall insertion operation takes linear time.
A linked list has to start at the head and follow links to find the location for the insert. In the worst case (inserting near the end) that might mean traversing the whole list. In the best case (inserting near the beginning) it doesn't need to search at all. So this part takes linear time at worst, but could be better. Once the insertion location is found, actually inserting the new node can be done with just a few pointer reassignments, which takes constant time. Since we have a (at worst) linear time operation followed by a constant time operation, we say the overall insertion operation takes linear time.

So it seems like array-backed lists are good at accessing an arbitrary location but bad at inserting into arbitrary locations.
Seems right to me.
And linked lists are good at inserting new items, but bad at getting to arbitrary locations.
Yup, that's a pretty good summary!

(When logged in, completion status appears here.)

Previous Page Next Page

CS 70

Warning: Missing Prerequisites

What's an Encoding?

Practice: STRING-LIST

STRING-LIST Encoded using Arrays

STRING-LIST Encoded using a Linked List

Trade-Offs: Compare and Contrast

Heads up!

Key Points

Key Points

Instructor Controls