Practical Advice for Numeric Types
“There are far too many integer types, there are far too lenient rules for mixing them together, and it’s a major bug source, which is why I’m saying stay as simple as you can, use [signed] integers til you really really need something else.”
Thanks for the advice Bjarne Stroustrup, creator of C++! But maybe you could just fix this problem and then we won't have to worry about it?
…Hello? Bjarne Stroustrup?
Ah well. I guess we'll just have to be careful about choosing types.
Which Type to Use
So, despite the plethora of integer types available, usually we can do fine with just a few.
- Use int when you have an integer from the user or input data, or when none of the cases below applies.
- Use char when you're working with characters, as characters.
  - Even though char is typically an 8-bit int, if you want an 8-bit int, do not use char for that. char is not guaranteed to be only eight bits, and it's also not guaranteed to be signed.
- Use size_t when you want to count how many of something you're storing, or for array indexes (both of which can only be non-negative).
  - The range of size_t varies based on the machine, and these are examples where the size is constrained by the machine itself (e.g., a 16-bit machine can't support huge arrays, but a 64-bit machine can).
  - If you have offsets that might be negative so you can't use size_t, use ptrdiff_t.
- Use int64_t and friends if you know there is a specific size range for your data, or you want it to fit into a certain space.
  - If you want an 8-bit int, int8_t has you covered (it'll probably be some kind of char).
- All other things being equal, prefer signed types to unsigned ones.
  - Google's C++ style guide says, "You should not use unsigned integers unless there is a valid reason, such as representing a bit pattern rather than a number, or you need defined overflow modulo 2^N. In particular, do not use unsigned types to say a number will never be negative. Instead, use assertions for this."
  - Try to avoid mixing signed and unsigned, especially for comparisons. So if you have to interoperate with code that is already using an unsigned type for a particular kind of value, it may be easiest to use an unsigned type to match.
- Use double when you have non-integer numerical data where scientific notation would be appropriate (especially in science and engineering contexts).
  - Floating point may not be appropriate for all contexts where large or fractional numbers are needed. For example, in the world of finance or discrete math, it may not be okay to have values that are only accurate to some number of significant figures.
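The guidelines above can be sketched in a few small functions. This is only an illustration under stated assumptions: every function name here (count_even, index_offset, total_cents, remaining_seats) is hypothetical and chosen for this example, not taken from any library.

```cpp
#include <cassert>   // assert
#include <cstddef>   // std::size_t, std::ptrdiff_t
#include <cstdint>   // std::int64_t
#include <vector>

// A count of stored items can never be negative and is bounded by what the
// machine can hold, so std::size_t is a natural fit (hypothetical helper).
std::size_t count_even(const std::vector<int>& values) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < values.size(); ++i) {
        if (values[i] % 2 == 0) {
            ++count;
        }
    }
    return count;
}

// The distance from one index to another may be negative, so the result
// is a std::ptrdiff_t rather than a std::size_t.
std::ptrdiff_t index_offset(std::size_t from, std::size_t to) {
    return static_cast<std::ptrdiff_t>(to) - static_cast<std::ptrdiff_t>(from);
}

// Money kept as whole cents in a fixed-width integer stays exact, whereas a
// double cannot represent an amount such as 0.10 exactly.
std::int64_t total_cents(const std::vector<std::int64_t>& prices_in_cents) {
    std::int64_t total = 0;
    for (std::int64_t price : prices_in_cents) {
        total += price;
    }
    return total;
}

// Signed parameters plus assertions, instead of unsigned parameters, to state
// that the values are never negative (the approach the quoted style guide suggests).
int remaining_seats(int capacity, int booked) {
    assert(capacity >= 0);
    assert(booked >= 0 && booked <= capacity);
    return capacity - booked;
}
```

For instance, index_offset(5, 2) yields -3, a value that size_t could not hold.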
Common Pitfalls
These are some common pitfalls you should look out for.
Forgetting That a Variable Is Unsigned
Sometimes people just forget a variable is unsigned and try to write loops that decrement the variable and exit when it goes negative. That's impossible, because an unsigned variable wraps around to its maximum value instead of going below zero.
So, for example, this loop will never finish:
size_t x = 9;
while (x >= 0) {  // always true: an unsigned value is never negative
    std::cout << "Still going... " << x << std::endl;
    --x;
}
But this one will print out the numbers 9 down to 0 and then stop:
size_t x = 10;
while (x > 0) {
    --x;
    std::cout << "Stopping soon... " << x << std::endl;
}
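The same "test first, then decrement" idea is often written as a for loop when walking array indexes backwards. The following is a minimal sketch; reverse_indexes and demo_reverse_indexes are hypothetical names used only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the indexes n-1, n-2, ..., 0 in the order a reverse loop visits them.
std::vector<std::size_t> reverse_indexes(std::size_t n) {
    std::vector<std::size_t> visited;
    // `i-- > 0` compares the old value of i with 0 and then decrements, so
    // the body sees n-1 down to 0. When the old value is 0 the loop ends; the
    // final decrement wraps i around, which is well defined for unsigned
    // types and harmless because i is not used afterwards.
    for (std::size_t i = n; i-- > 0; ) {
        visited.push_back(i);
    }
    return visited;
}

// Small usage check: three elements are visited as 2, 1, 0.
void demo_reverse_indexes() {
    const std::vector<std::size_t> order = reverse_indexes(3);
    assert(order.size() == 3);
    assert(order[0] == 2 && order[1] == 1 && order[2] == 0);
}
```

With n equal to 0 the body never runs, so the loop is also safe for empty containers.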
Overflowing a Signed Variable (Undefined Behavior!)
Somewhat strangely, although unsigned types wrap around, C and C++ say that if we cause a signed integer to go beyond its maximum or minimum values, the result is undefined behavior, so we're required to ensure we never do this.
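For example, this program causes undefined behavior on every system where int is 32 bits:
int x = 2147483647;
++x;
(Merely initializing a variable with an out-of-range value, such as char x = 1000, is a conversion rather than an arithmetic overflow. Its result is implementation-defined before C++20 and wraps modulo 2^N from C++20 onward, so it is surprising but not undefined.)
Similarly, the range of signed integer types is asymmetric: the smallest negative value has a magnitude one larger than the largest positive value, so negating it overflows.
For example, this program also causes undefined behavior on every system where int is 32 bits:
int x = -2147483647 - 1;
x = -x;
(With short instead of int, x would be promoted to int before the negation, so the arithmetic itself would not overflow.)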
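One way to stay clear of signed overflow is to test whether the result would fit before doing the arithmetic. The sketch below is an illustration only; checked_add and checked_negate are hypothetical helpers written for this example, not standard library functions.

```cpp
#include <cassert>
#include <limits>

// Stores a + b in `sum` and returns true, or returns false (leaving `sum`
// untouched) if the mathematical result would not fit in an int.
bool checked_add(int a, int b, int& sum) {
    const int max = std::numeric_limits<int>::max();
    const int min = std::numeric_limits<int>::min();
    if (b > 0 && a > max - b) return false;  // a + b would exceed max
    if (b < 0 && a < min - b) return false;  // a + b would go below min
    sum = a + b;  // safe: the checks above rule out overflow
    return true;
}

// Stores -value in `negated` and returns true, or returns false if -value
// is not representable (the case for the smallest negative int).
bool checked_negate(int value, int& negated) {
    if (value < -std::numeric_limits<int>::max()) return false;
    negated = -value;
    return true;
}

// Small usage check for the two helpers.
void demo_checked_arithmetic() {
    int result = 0;
    assert(checked_add(2, 3, result) && result == 5);
    assert(!checked_add(std::numeric_limits<int>::max(), 1, result));
    assert(!checked_negate(std::numeric_limits<int>::min(), result));
}
```

The tests themselves only ever compute values that are guaranteed to fit, so they never trigger the undefined behavior they guard against.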
Mixing Signed and Unsigned Types
The rules for what happens when you mix signed and unsigned are hard to remember and can be surprising.
For example, try
int x = -3;
unsigned int y = 1;
std::cout << x - y << std::endl;
On a system where int is represented using 32 bits, it will print 4294967292 because x was converted to unsigned int before the subtraction, so the result is unsigned too. There are a few ways we could fix this code.
One way to get what we would expect is to be explicit about the conversions we want performed, so that everything has the same type:
int x = -3;
unsigned int y = 1;
std::cout << x - static_cast<int>(y) << std::endl;
But it would have been much easier if we had just begun with
int x = -3;
int y = 1;
std::cout << x - y << std::endl;
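Comparisons suffer from the same implicit conversion as the subtraction above. When a signed and an unsigned value really do have to be compared, a small helper can compare them by mathematical value. This is a minimal sketch; converted_less and value_less are hypothetical names invented for this example.

```cpp
#include <cassert>

// What `a < b` does implicitly when a is int and b is unsigned int:
// a is converted to unsigned int first, so a negative a becomes huge.
bool converted_less(int a, unsigned int b) {
    return static_cast<unsigned int>(a) < b;
}

// Compares by mathematical value instead.
bool value_less(int a, unsigned int b) {
    if (a < 0) {
        return true;  // every negative int is below every unsigned value
    }
    return static_cast<unsigned int>(a) < b;  // a >= 0, so the cast is exact
}

// Small usage check: -3 is mathematically less than 1, but not after conversion.
void demo_mixed_comparison() {
    assert(!converted_less(-3, 1u));
    assert(value_less(-3, 1u));
}
```

If C++20 is available, std::cmp_less from the utility header performs this kind of value-based comparison for any pair of integer types.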