CS 70

Saving Time by Not Moving/Copying Data

We've already seen how using references can help us avoid making needless copies of data. Now we'll look at how pointers can help us avoid some related problems.

An Example that Works Too Hard

Let's suppose we have an Elephant class with this definition, and some code that uses it.

class Elephant {
 public:
    Elephant();
    Elephant(const std::string& name, int age);
    Elephant(const Elephant& other);
    ~Elephant();
    Elephant& operator=(const Elephant& rhs);

    void remember(std::string event);
    void storyToStream(std::ostream& out);

    std::string name_ = "<unnamed>";
    int age_ = -1;

 private:
    size_t numMemories_ = 0;
    std::array<std::string, 100> memories_;
};

And this code that creates a herd, and sorts it by name, and prints it out:

constexpr size_t HSIZE = 7;
Elephant herd[HSIZE] = {{"Nellie", 7},
                        {"Kandula", 8},
                        {"Dumbo", 2},
                        {"Horton", 3},
                        {"Heffalump", 12},
                        {"Peanut", 1},
                        {"Mahmud", 9}};

sortElephantsByName(herd, herd+HSIZE);

for (Elephant* p = herd; p < herd+HSIZE; ++p) {
    cout << (*p).name_ << ", " << (*p).age_ << endl;
}

The sortElephantsByName function was declared as:

// Perform an in-place sort on a range of Elephants
void sortElephantsByName(Elephant* start, Elephant* pastTheEnd);

The details of sortElephantsByName need not concern us, but the key thing about sorting is that it usually involves swapping pairs of items until the sequence is sorted. And swapping often involves creating a temporary.

Thus, somewhere in the process, we'll have code that says something like:

Elephant temp = *p;
*p = *q
*q = temp;
  • Cat speaking

    Just so I'm clear, you're imagining p is of type Elephant*?

  • LHS Cow speaking

    Yes. The details of the sort function need not concern us, but it's working with an array, so it will work with pointers to array elements, in this case Elephants, so *p gives us an Elephant and p itself is a pointer to an Elephant, namely an Elephant*.

What's a concern about this code?

What's wrong with this code?

  • Goat speaking

    The (*p).name_ syntax is, like, super ugly.

  • LHS Cow speaking

    That wasn't really our main point, here.

  • RHS Cow speaking

    But since you mention it, we can instead write that as p->name_. It means the exact same thing.

  • LHS Cow speaking

    But there's something more fundamental.

The animated GIF below visualizes what's happening as we sort the array, with the single box on the right being our temp space for swapping. Every time you see an Elephant value move, we're copying data, and every time one goes up in flames, it's being overwritten due to assignment or destroyed.

Elephant Sorting Animation (Copying whole Elephants)

  • Hedgehog speaking

    Seeing all those Elephant objects get overwritten and destroyed is making me feel queasy.

What do you think? Does it seem like we're working too hard?

Indirection to the Rescue

  • LHS Cow speaking

    Can we think of something smaller than an Elephant that would be cheaper to move around?

  • Hedgehog speaking

    Oh no, it's going to be a pointer to an Elephant, isn't it?

  • Pig speaking

    MORE pointers! Oh yeah!!

We could have an array of Elephant* pointers and sort that.

Elephant* herdPtrs[HSIZE];

for (size_t i = 0; i < HSIZE; ++i) {
    herdPtrs[i] = herd + i;
}

sortElephantPtrsByName(herdPtrs, herdPtrs+HSIZE);

Now all the Elephants stay put, and only our array of pointers to Elephants is reordered. Given that Elephants are big, moving pointers is likely to to be faster. See the animated GIF below.

Elephant Sorting Animation (with Pointers to Elephants)

Figuring Out the Types

Previously, we had a sortElephantsByName function, with the signature

// Perform an in-place sort on a range of Elephants
void sortElephantsByName(Elephant* start, Elephant* pastTheEnd);

Now we'll use a sortElephantPtrsByName function, but what are the types of its arguments? In other words, what should go in the function signature below, where we've put XXXXX?

// Perform an in-place sort on a range of Elephant*s
void sortElephantPtrsByName(XXXXX start, XXXXX pastTheEnd);

What's the right type for the arguments to sortElephantPtrsByName?

Comparing the Two Approaches in Practice

The most important thing to understand here is that by using pointers, we were able to move just the pointers around rather than moving Elephants themselves.

  • LHS Cow speaking

    If the animated GIFs made sense, you've got the key point.

  • Cat speaking

    Sometimes it helps me to see things over a different way.

  • LHS Cow speaking

    Okay…

Here's two other ways to look at this code:

Option 1: Play with it yourself, in OnlineGDB

The code to focus on most is the code in main.cpp and look at output produced when you run the code.

When you run the code, be sure to scroll back in the console window to see all the output.

Option 2: Watch a video exploring what happens

The video below looks at the two approaches in more detail.

Avoiding Semantic Overload

Here's two versions of the code for printing out our elephants in sorted order.

for (Elephant** p = herdPtrs; p < herdPtrs+HSIZE; ++p) {
    cout << (**p).name_ << ", " << (**p).age_ << endl;
}

and

for (size_t i = 0; i < HSIZE; ++i) {
    cout << herdPtrs[i]->name_ << ", " << herdPtrs[i]->age_ << endl;
}

Both work. Which do you like better?

  • Pig speaking

    The first one!! MORE stars!

  • Hedgehog speaking

    If it has to be one of them, I guess the second.

  • LHS Cow speaking

    I like the second one, too. When I see two * next to each other, it makes my code feel complicated. Even when I do have pointers to pointers, I like to try to make it as easy to think about as possible.

  • RHS Cow speaking

    In CS 70, pointers to pointers don't crop up often, just when we work with arrays of pointers.

(When logged in, completion status appears here.)