Accessing Array Elements
So we know how to make an array. How do we actually access the things in the array?
Good question! In the end, the answer will be very familiar and comfortable.
But, in the spirit of CS 70, we are going to explain some of the underlying details first.
That will help you predict what will happen when working with arrays and it will lay some groundwork for important topics later on.
The Value of an Array Variable
Say we make an array:
float myArr[3]{1.2, 3.4, 5.6};
Here is what that array would look like in memory:
We can see the values of the floats in the array. But the variable we just created is named myArr.
What is the value of myArr?
Key ideas:
- When we ask for the “value” of an array variable, we get its memory address.
- Specifically, the memory address of the first item in the array.
- In general, the memory address of something that takes up multiple slots will be the first address in the thing.
- So in our example, using our memory model, trying to get the value of myArr gives us its address on the stack: S1.
So, if myArr has a value, why isn't it on our diagram?
Great question! First, we could say that myArr is on the diagram. The problem is that our array consists of three memory locations holding three distinct values (1.2, 3.4, and 5.6), but when we say myArr, we're expecting to get just one value to somehow represent the whole array. For reasons we'll appreciate more in a bit, C++ doesn't give an error here, and it doesn't give us just the first value in the array (1.2) either. Instead, it gives us the memory address of the first element (in this case, S1).
So myArr is the address of the first element, like we said. But does that mean we could say myArr = myOtherArr; to change it to some other address?
No, you can't say that. myArr is fixed to always be the array we made. It always stays put at S1 for the duration of the function, just like other non-array local variables.
Technically, it's not that the value of myArr is S1, it's that it decays to S1 because there is no such thing as the “value of a whole array”.
But doesn't it need to store the address in a space somewhere so we have it when we need it?
No, there doesn't need to be space to store myArr's address because the compiler knows where it put it at compile time!
So whenever we write myArr just by itself, the program doesn't have to read the value of a variable to figure out the address, it just knows the address of the array?
Exactly.
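If you'd like to see this for yourself, here is a minimal sketch. The function names are made up for illustration, and the check uses & (“address of”) and square brackets, neither of which this section has introduced yet; treat them as borrowed tools for asking for the first element's address explicitly. There is no main here, so call the functions from your own.

```cpp
#include <iostream>

// Returns true when the bare array name gives the same address as the
// array's first element.
bool nameGivesFirstAddress() {
    float myArr[3]{1.2f, 3.4f, 5.6f};
    // Assumption: & and [0] are used only to ask for the first element's
    // address explicitly; they are not explained in this section.
    return myArr == &myArr[0];
}

// Prints the address that myArr decays to (what our memory model calls S1).
void printArrayAddress() {
    float myArr[3]{1.2f, 3.4f, 5.6f};
    std::cout << myArr << '\n';  // a real address; the exact value varies
    // float other[3]{};
    // myArr = other;            // does not compile: arrays cannot be assigned
}
```

The commented-out assignment is the myArr = myOtherArr; idea from the conversation above; uncommenting it should produce a compile error rather than a changed address.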
Consider the following code snippet:
int main() {
    float num = 3.14;
    int a[5]{11, 22, 33, 44, 55};
    return 0;
}
Accessing the Value at an Address
C++ gives us the ability to get the value at a given memory address!
If you type *x, you're saying “the value at the memory address x”.
The * is called the indirection operator because when we have an address, we don't have the thing itself (we don't have it directly); instead we have something that tells us where it is (so it gives us the thing indirectly). The * operator gives us the actual thing in that piece of memory.
So, in this example, x is like a treasure map, and *x is the treasure.
Yarr. x marks the spot!
Yes, and *x is the stuff at the spot x marks.
Backing up a bit, you said * was called the indirection operator, but I had heard that it was called the dereference operator?
That's another name for it, and that's what C programmers call it. Frankly, most of the time we'll just read *x as “star x” because that's less of a mouthful.
But the official name for it in C++ is the indirection operator.
Your professors (and some of the course materials) may sometimes say “dereference” out of habit, but we're trying to avoid that term these days as it can be a bit too easy to mix it up with some other unrelated C++ concepts.
So, back to our original example:
float myArr[3]{1.2, 3.4, 5.6};
cout << *myArr << endl;
What gets printed?
Hints:
- Remember, we decided that the value of myArr is S1.
- You might want to refer back to our memory diagram above!
So, we can access the value in the first position of myArr, but...
We can also change that value!
If we have
float myArr[3]{1.2, 3.4, 5.6};
*myArr = 7.8;
then the memory diagram looks like
Assigning a value to *myArr is the same as assigning a value to the first float in the array!
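Here is a small sketch of both uses of *myArr, reading and writing. The function names are hypothetical, and the f suffix on the literals is our addition (it makes them float literals so the comparisons are exact). Call the functions from your own main.

```cpp
// Reads the first item through the indirection operator.
float readFirst() {
    float myArr[3]{1.2f, 3.4f, 5.6f};
    return *myArr;  // the value stored at the start of the array
}

// Overwrites the first item through the indirection operator,
// then reads it back the same way.
float overwriteFirst() {
    float myArr[3]{1.2f, 3.4f, 5.6f};
    *myArr = 7.8f;  // same as assigning to the first float in the array
    return *myArr;
}
```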
The Name of an Array Item
So *myArr returns the value of the first item.
Yep!
And *myArr = x sets the value of the first item to x.
Also, yes.
So… it seems like *myArr is just the name of the first item.
YES!
Though it's not literally a variable name (it's an operator applied to a variable), *myArr is, for all intents and purposes, a name for the first item in the array. We can get its value and we can assign a value to it.
So, on our diagram, we'll just write *myArr next to the first item as its name! Anytime we say *myArr in our code, we are talking about that item in the array.
Recall this code snippet:
int main() {
    float num = 3.14;
    int a[5]{11, 22, 33, 44, 55};
    return 0;
}
Accessing Other Items
What's the point of having an array if you can only access the first item?
That would indeed be silly. Thankfully, that's not our situation!
We just need a bit more syntax to make it work.
If x stores a memory address, then *x is the value at that memory address.
We can also get memory addresses at offsets from x. Specifically, x + 1 is also a memory address. It's the "next" memory address.
So, if myArr has the value S1, then myArr + 1 has the value S2.
So then we can get the value of the second item!
Yes, that's right!
We know that *myArr is a name for the first item, because *x means the value at address x.
In that case, *(myArr + 1) is a name for the second item, because myArr + 1 gives the address S2.
So now we can label every item in the array with a name!
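To make the star-and-offset names concrete, here is a minimal sketch using the same three floats. The function names are made up for illustration, and the offset is assumed to stay between 0 and 2; anything else would reach outside the array. Call the functions from your own main.

```cpp
// Returns the item that lives `offset` slots after the start of the array.
// Assumes 0 <= offset <= 2; other values would read outside the array.
float itemAtOffset(int offset) {
    float myArr[3]{1.2f, 3.4f, 5.6f};
    return *(myArr + offset);
}

// Assigns to the second item by its name, *(myArr + 1), and checks that
// the first and third items are left alone.
bool onlySecondChanges() {
    float myArr[3]{1.2f, 3.4f, 5.6f};
    *(myArr + 1) = 9.9f;
    return *myArr == 1.2f && *(myArr + 1) == 9.9f && *(myArr + 2) == 5.6f;
}
```

Note the parentheses: *(myArr + 1) moves to the next address first and then looks there, whereas *myArr + 1 would fetch the first item and add 1 to its value.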
Deeper Dive: Adding to a Memory Address
If you get the general idea, you can skip this deeper dive, but if it all seems a bit odd, or if you want a slightly deeper understanding, keep reading.
The truth is that every byte in memory has an address. So, if x is the address of a byte (say, 0xfb34a82), you might imagine that x + 1 is just the address of the next byte (say, 0xfb34a83).
The problem with this is that most types take up more than one byte. For instance, in the most common systems an int takes up 4 bytes of memory. The address of an int is the address of the first byte in that int. But the next byte is just somewhere inside the int! That wouldn't be very helpful for accessing items in an array.
So the truth is that if x is the address of (the first byte of) an int (say, 0xfb34a82), then x + 1 is the address of (the first byte of) the next int (0xfb34a86, 4 bytes further because an int takes up 4 bytes).
In our CS 70 memory model we don't worry about addresses at the byte-level, so we don't need to worry about this subtlety.
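For the curious, here is one way you might measure that step in bytes. This sketch leans on things this section has not introduced (a pointer type, reinterpret_cast, and sizeof), so take it as a peek ahead rather than something you need to follow in detail. The function name is hypothetical; call it from your own main.

```cpp
#include <cstddef>

// Returns how many bytes separate the address of one int in an array
// from the address of the next one.
std::ptrdiff_t bytesBetweenNeighbors() {
    int a[2]{11, 22};
    // Assumption: viewing the addresses as byte (char) addresses lets us
    // count individual bytes instead of whole ints.
    const char* first = reinterpret_cast<const char*>(a);
    const char* second = reinterpret_cast<const char*>(a + 1);
    return second - first;
}
```

On a system where an int takes up 4 bytes, we would expect this to return 4; in general it should match sizeof(int).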
Indexing Syntax
This all seems really complicated. Whatever happened to myArr[0], like in Java?
Oh, we can do that too!
WHAT?? SERIOUSLY?
Yeah, totally. That works in C++ too!
If myArr is an array, then myArr[i] is exactly equivalent to *(myArr + i), and is a name for the item at index i.
So we can redraw our diagram as
If we can just use the square brackets like usual, why would you show us all this stuff about memory addresses and stars and stuff?!??
We did say that it would be familiar by the end!
Understanding what the square brackets mean will help you make predictions about what might happen when you use them in unfamiliar situations.
Also, memory addresses and stars are going to be important this semester. This is just the first encounter!
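If you want to convince yourself that myArr[i] and *(myArr + i) really are two spellings of the same name, a sketch like this one can help. It writes with one spelling and reads with the other. The function name is made up for illustration; call it from your own main.

```cpp
// Checks that the bracket spelling and the star-and-offset spelling
// refer to the same item at every index.
bool bracketsMatchStars() {
    float myArr[3]{1.2f, 3.4f, 5.6f};
    for (int i = 0; i < 3; ++i) {
        if (myArr[i] != *(myArr + i)) {
            return false;
        }
    }
    myArr[2] = 7.8f;              // write using brackets...
    return *(myArr + 2) == 7.8f;  // ...and read using star-and-offset
}
```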
Summary
Suppose that we declare an array a as int a[5];.
- Using the variable name a by itself gives us a memory address.
  - Specifically, a gives us the memory address of the first item in the array.
- To access the item at index i, you can use a[i].
- The compiler literally changes the syntax a[i] into this equivalent form: *(a + i).
  - a + i is the memory address at an offset of i from the address given by a.
  - So, in effect, a + i is the memory address of the item at index i, and *(a + i) is the actual value at the address given by a + i.
Mysterious Mysteries Demystified (Optional)
Now, after all this time, you finally know why computer scientists start indexing at 0: the index is really an offset from the address of the first element.
Gasp!
Gasp!
Gasp!!
Arrr!
These days many high-level languages don't give direct access to memory addresses like this, but still 0-index because of tradition and backwards compatibility.
Python could easily index lists starting at 1; it just doesn't because programmers are so used to 0-indexing.
MATLAB, on the other hand, 1-indexes containers because it is more targeted at mathematicians, who are used to 1-indexing vectors and matrices.
Either way, deep down it's all about memory addresses!
And gold!
Shh!
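As a small illustration of “the index is an offset”, here is a sketch of what 1-based counting would have to do behind the scenes. The function names are hypothetical, and k is assumed to stay between 1 and 5. Call the functions from your own main.

```cpp
// Returns the k-th item counting from 1, the way a 1-indexed language would.
// Assumes 1 <= k <= 5.
int kthItemOneBased(int k) {
    int a[5]{11, 22, 33, 44, 55};
    return *(a + (k - 1));  // the k-th item sits at offset k - 1
}

// Index 0 means "offset 0 from the start", which is the first item itself.
bool indexZeroIsTheStart() {
    int a[5]{11, 22, 33, 44, 55};
    return (a[0] == *a) && ((a + 0) == a);
}
```

With 0-indexing, the subtraction in kthItemOneBased disappears: the index is already the offset.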