Saving Time by Not Moving/Copying Data
We've already seen how using references can help us avoid making needless copies of data. Now we'll look at how pointers can help us avoid some related problems.
An Example that Works Too Hard
Let's suppose we have an Elephant
class with this definition, and some code that uses it.
class Elephant {
public:
Elephant();
Elephant(const std::string& name, int age);
Elephant(const Elephant& other);
~Elephant();
Elephant& operator=(const Elephant& rhs);
void remember(std::string event);
void storyToStream(std::ostream& out);
std::string name_ = "<unnamed>";
int age_ = -1;
private:
size_t numMemories_ = 0;
std::array<std::string, 100> memories_;
};
And this code that creates a herd, and sorts it by name, and prints it out:
constexpr size_t HSIZE = 7;
Elephant herd[HSIZE] = {{"Nellie", 7},
{"Kandula", 8},
{"Dumbo", 2},
{"Horton", 3},
{"Heffalump", 12},
{"Peanut", 1},
{"Mahmud", 9}};
sortElephantsByName(herd, herd+HSIZE);
for (Elephant* p = herd; p < herd+HSIZE; ++p) {
cout << (*p).name_ << ", " << (*p).age_ << endl;
}
The sortElephantsByName
function was declared as:
// Perform an in-place sort on a range of Elephants
void sortElephantsByName(Elephant* start, Elephant* pastTheEnd);
The details of sortElephantsByName
need not concern us, but the key thing about sorting is that it usually involves swapping pairs of items until the sequence is sorted. And swapping often involves creating a temporary.
Thus, somewhere in the process, we'll have code that says something like:
Elephant temp = *p;
*p = *q
*q = temp;
Just so I'm clear, you're imagining
p
is of typeElephant*
?Yes. The details of the sort function need not concern us, but it's working with an array, so it will work with pointers to array elements, in this case
Elephant
s, so*p
gives us anElephant
andp
itself is a pointer to anElephant
, namely anElephant*
.
What's wrong with this code?
The
(*p).name_
syntax is, like, super ugly.That wasn't really our main point, here.
But since you mention it, we can instead write that as
p->name_
. It means the exact same thing.But there's something more fundamental.
The animated GIF below visualizes what's happening as we sort the array, with the single box on the right being our temp
space for swapping. Every time you see an Elephant
value move, we're copying data, and every time one goes up in flames, it's being overwritten due to assignment or destroyed.
Seeing all those
Elephant
objects get overwritten and destroyed is making me feel queasy.
Indirection to the Rescue
Can we think of something smaller than an
Elephant
that would be cheaper to move around?Oh no, it's going to be a pointer to an
Elephant
, isn't it?MORE pointers! Oh yeah!!
We could have an array of Elephant*
pointers and sort that.
Elephant* herdPtrs[HSIZE];
for (size_t i = 0; i < HSIZE; ++i) {
herdPtrs[i] = herd + i;
}
sortElephantPtrsByName(herdPtrs, herdPtrs+HSIZE);
Now all the Elephant
s stay put, and only our array of pointers to Elephant
s is reordered. Given that Elephant
s are big, moving pointers is likely to to be faster. See the animated GIF below.
Figuring Out the Types
Previously, we had a sortElephantsByName
function, with the signature
// Perform an in-place sort on a range of Elephants
void sortElephantsByName(Elephant* start, Elephant* pastTheEnd);
Now we'll use a sortElephantPtrsByName
function, but what are the types of its arguments? In other words, what should go in the function signature below, where we've put XXXXX
?
// Perform an in-place sort on a range of Elephant*s
void sortElephantPtrsByName(XXXXX start, XXXXX pastTheEnd);
Comparing the Two Approaches in Practice
The most important thing to understand here is that by using pointers,
we were able to move just the pointers around rather than moving Elephants
themselves.
If the animated GIFs made sense, you've got the key point.
Sometimes it helps me to see things over a different way.
Okay…
Here's two other ways to look at this code:
Option 1: Play with it yourself, in OnlineGDB
The code to focus on most is the code in main.cpp
and look at output produced when you run the code.
- Open Elephant Code in OnlineGDB and run it.
- Open Elephant-Pointer Code in OnlineGDB and run it.
When you run the code, be sure to scroll back in the console window to see all the output.
Option 2: Watch a video exploring what happens
The video below looks at the two approaches in more detail.
Avoiding Semantic Overload
Here's two versions of the code for printing out our elephants in sorted order.
for (Elephant** p = herdPtrs; p < herdPtrs+HSIZE; ++p) {
cout << (**p).name_ << ", " << (**p).age_ << endl;
}
and
for (size_t i = 0; i < HSIZE; ++i) {
cout << herdPtrs[i]->name_ << ", " << herdPtrs[i]->age_ << endl;
}
Both work. Which do you like better?
The first one!! MORE stars!
If it has to be one of them, I guess the second.
I like the second one, too. When I see two
*
next to each other, it makes my code feel complicated. Even when I do have pointers to pointers, I like to try to make it as easy to think about as possible.In CS 70, pointers to pointers don't crop up often, just when we work with arrays of pointers.
(When logged in, completion status appears here.)