CS 70

Practical Advice for Numeric Types

  • Bjarne speaking

    “There are far too many integer types, there are far too lenient rules for mixing them together, and it’s a major bug source, which is why I’m saying stay as simple as you can, use [signed] integers til you really really need something else.”

  • LHS Cow speaking

    Thanks for the advice Bjarne Stroustrup, creator of C++! But maybe you could just fix this problem and then we won't have to worry about it?

  • RHS Cow speaking

    …Hello? Bjarne Stroustrup?

  • LHS Cow speaking

    Ah well. I guess we'll just have to be careful about choosing types.

Which Type to Use

So, despite the plethora of integer types available, usually we can do fine with just a few.

  • Use int when you have an integer from the user or input data, or it's not one of the cases below.
  • Use char when you're working with characters, as characters.
    • Even though char is typically an 8-bit int, if you want an 8-bit int, do not use char for that. char is not guaranteed to be only eight bits, and it's also not guaranteed to be signed.
  • Use size_t when you want to count how many of something you're storing, or for array indexes (both of which can only be non-negative).
    • The range of size_t varies based on the machine, and these are examples where the size is constrained by the machine itself (e.g., a 16-bit machine can't support huge arrays, but a 64-bit machine can).
    • If you have offsets that might be negative so you can't use size_t, use ptrdiff_t.
  • Use int64_t and friends if you know there is a specific size range for your data, or you want it to fit into a certain space.
    • If you want an 8-bit int, int8_t has you covered (it'll probably be some kind of char).
  • All other things being equal, prefer signed types to unsigned ones.
    • Google's C++ style guide says, "You should not use unsigned integers unless there is a valid reason, such as representing a bit pattern rather than a number, or you need defined overflow modulo 2^N. In particular, do not use unsigned types to say a number will never be negative. Instead, use assertions for this."
    • Try to avoid mixing signed and unsigned, especially for comparisons. So if you have to interoperate with code that is already using an unsigned type for a particular kind of value, it may be easiest to use an unsigned type to match.
  • Use double when you have non-integer numerical data where scientific notation would be appropriate (especially in science and engineering contexts).
    • Floating point may not be appropriate for all contexts where large or fractional numbers are needed. For example, in the world of finance or discrete math, it may not be okay to have values that are only accurate to some number of significant figures.
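
As a rough illustration of the advice above, here is a minimal sketch that puts a few of these choices side by side. The function names (countMatches, signedOffset, totalCents) and the idea of storing money as whole cents are assumptions made for this example, not part of any library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// A count of stored things can never be negative, so it is a size_t,
// and so is the index used to walk the vector.
std::size_t countMatches(const std::vector<int>& values, int target) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < values.size(); ++i) {
        if (values[i] == target) {
            ++count;
        }
    }
    return count;
}

// The distance from one index to another may be negative, so the result
// is a ptrdiff_t rather than a size_t.
std::ptrdiff_t signedOffset(std::size_t from, std::size_t to) {
    return static_cast<std::ptrdiff_t>(to) - static_cast<std::ptrdiff_t>(from);
}

// Money kept as a whole number of cents in a fixed-width signed type,
// which avoids the rounding that a double would introduce.
// The assertion, not an unsigned type, says the quantity is non-negative.
std::int64_t totalCents(std::int64_t priceCents, int quantity) {
    assert(quantity >= 0);
    return priceCents * quantity;
}
```

With these definitions, countMatches({1, 2, 1, 3}, 1) yields 2, signedOffset(3, 1) yields -2, and totalCents(1999, 3) yields 5997.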

Common Pitfalls

These are some common pitfalls you should look out for.

Forgetting That a Variable Is Unsigned

Sometimes people just forget a variable is unsigned and try to write loops that decrement the variable and exit when it goes negative. That's impossible: an unsigned value wraps around to its maximum instead of going negative, so a test like x >= 0 is always true.

So, for example, this loop will never finish:

size_t x = 9;
while (x >= 0) {
    std::cout << "Still going... " << x << std::endl;
    --x;
}

But this one will print out the numbers 9 down to 0 and then stop:

size_t x = 10;
while (x > 0) {
    --x;
    std::cout << "Stopping soon... " << x << std::endl;
}
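
The same "test first, then decrement" pattern is handy when walking an array backwards with a size_t index. The following is a minimal sketch; the helper name reversedCopy is made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Return the elements of values in reverse order, using an unsigned index.
std::vector<int> reversedCopy(const std::vector<int>& values) {
    std::vector<int> result;
    result.reserve(values.size());
    // Check i > 0 before decrementing, so i never has to go below zero.
    // Inside the body, i takes the values size() - 1 down to 0.
    for (std::size_t i = values.size(); i > 0; ) {
        --i;
        result.push_back(values[i]);
    }
    assert(result.size() == values.size());
    return result;
}
```

Because the test happens before the decrement, an empty vector is handled too: the loop body simply never runs.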

Overflowing a Signed Variable (Undefined Behavior!)

Somewhat strangely, although unsigned types wrap around, C and C++ say that if we cause a signed integer to go beyond its maximum or minimum values, the result is undefined behavior, so we're required to ensure we never do this.

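For example, this code causes undefined behavior on every system where int is 32 bits:

int x = 2147483647;  // the largest 32-bit int
x = x + 1;           // signed overflow: undefined behavior

(Merely storing an out-of-range value, as in char x = 1000;, is a narrowing conversion rather than an arithmetic overflow. Its result is at worst implementation-defined, not undefined behavior.)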

Similarly, the range of signed integer types is asymmetric: the smallest negative value has a magnitude one bigger than the largest positive value, so negating it overflows.

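For example, this code causes undefined behavior on every system where int is 32 bits:

int x = -2147483647 - 1;  // the smallest 32-bit int, -2147483648
x = -x;                   // 2147483648 does not fit in an int

Types narrower than int, such as short, are promoted to int before the negation, so they do not overflow in this way on systems where int is wider.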
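
One way to stay clear of signed overflow is to check whether an operation fits before performing it. The sketch below shows one possible approach for int; the helper names (additionFits, negationFits, checkedAdd) are hypothetical, and the checks are written so that they never overflow themselves.

```cpp
#include <cassert>
#include <limits>

// Returns true if a + b can be computed without overflowing an int.
bool additionFits(int a, int b) {
    const int maxInt = std::numeric_limits<int>::max();
    const int minInt = std::numeric_limits<int>::min();
    if (b > 0) {
        return a <= maxInt - b;  // b > 0, so maxInt - b cannot overflow
    }
    return a >= minInt - b;      // b <= 0, so minInt - b cannot overflow
}

// Returns true if -x can be computed without overflowing an int.
// Only the most negative value has no positive counterpart.
bool negationFits(int x) {
    return x != std::numeric_limits<int>::min();
}

// Adds two ints, asserting first that the sum is representable.
int checkedAdd(int a, int b) {
    assert(additionFits(a, b));
    return a + b;
}
```

Here additionFits(std::numeric_limits<int>::max(), 1) is false, so a caller can report an error or switch to a wider type instead of running into undefined behavior.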

Mixing Signed and Unsigned Types

The rules for what happens when you mix signed and unsigned are hard to remember and can be surprising.

For example, try

int x = -3;
unsigned int y = 1;
std::cout << x - y << std::endl;

Stop for a moment and think.

What do you think the code above will print?

On a system where int is represented using 32 bits, it will print

4294967292

because x was converted to unsigned int before the subtraction, so the result of the difference was unsigned too. There are a few ways we could fix this code.

One way to get what we would expect is to be explicit about the conversions we want performed to get everything to be the same type:

int x = -3;
unsigned int y = 1;
std::cout << x - int(y) << std::endl;

But it would have been much easier if we had just begun with

int x = -3;
int y = 1;
std::cout << x - y << std::endl;
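
Comparisons run into the same conversion. When code really does have to compare an int against an unsigned int, one option is to handle the negative case explicitly, as in the sketch below. Both function names are made up for this example: builtInLess spells out the conversion that a plain x < y performs implicitly, and safeLess gives the mathematically expected answer. (C++20 also offers std::cmp_less in <utility> for this purpose.)

```cpp
#include <cassert>  // for the checks described in the note below

// What a plain mixed comparison effectively does: the int is converted
// to unsigned first, so a negative value becomes a huge positive one.
bool builtInLess(int a, unsigned int b) {
    return static_cast<unsigned int>(a) < b;
}

// The mathematically expected result: any negative int is less than
// any unsigned value; otherwise the conversion is safe.
bool safeLess(int a, unsigned int b) {
    if (a < 0) {
        return true;
    }
    return static_cast<unsigned int>(a) < b;
}
```

With these definitions, assert(safeLess(-3, 1u)) passes, while builtInLess(-3, 1u) is false because -3 turns into a very large unsigned value.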
