CS 70

Generating Random Numbers in C++

  • Cat speaking

    When we implement our randomized insert, we can't exactly pull out an \( N \)-sided die every time we make a recursive call to insert!

  • Dog speaking

    Yeah. I mean, what if we can't find our 563-sided die?

  • LHS Cow speaking

    That's… definitely the biggest logistical challenge, Dog.

  • RHS Cow speaking

    But you're all right, of course—we can't have our insert function stop and wait for a die roll.

  • LHS Cow speaking

    Instead, we need to simulate the die roll (or coin toss) by generating a random number in code.

From Dice to C++

Unfortunately, while C++ provides expansive random-number generation facilities with many capabilities, none of them are especially simple to use.

In contrast, Java makes getting random numbers easy, with java.util.Random (although they aren't very good quality random numbers). Python has similar functionality in its random module.

C++'s random-number facility is more complex. That complexity gives us a lot of control as programmers, which matches what we'd expect from C++.

It also means that when you just want to do something like get a random number in a certain range, it can feel like a lot of scaffolding is needed to get there.

  • Duck speaking

    How about we use rand() from C? That's pretty easy.

  • LHS Cow speaking

    C's built-in random facilities are pretty primitive and have their own issues. And cpplint considers rand() problematic enough that it won't let you use it.

  • Duck speaking

    Okay, so can we do things the C++ way?

  • LHS Cow speaking

    We will. But the goal for this assignment is to learn about trees and recursion and iterators and things.

  • RHS Cow speaking

    We really, really don't want you spending tons of time on figuring out how to generate a random number properly in C++.

  • LHS Cow speaking

    So for this class, we provide you with a random-number generator in the CS 70 Docker container.

The CS 70 RandUInt32 Class

Our Docker image has a library called randuint32. You can include it in your code with:

#include <cs70/randuint32.hpp>

The library defines a class called RandUInt32.

Instances of that class can generate uniformly distributed random integers. You access those numbers by calling the class's get(…) member function, which takes in a number N, and returns a random integer between 0 and N-1 (and thus has N distinct possible values).

  • Goat speaking

    What if I call myGenerator.get(0)?

  • LHS Cow speaking

    Don't do that.

  • Pumpy speaking

    Maybe something spooky happens! Perhaps magic smoke comes out of the computer! You never know with undefined behavior!

Any time you want to compile and link code that uses the RandUInt32 class, you will need to tell the linker

  • -lranduint32 to link in the randuint32 library.

So if your program is called myrand.cpp, you would compile to a .o file like normal and then link with our random-number library. For example,

clang++ -c -std=c++17 -Wall -Wextra -pedantic myrand.cpp
clang++ -o myrand myrand.o -lranduint32
  • LHS Cow speaking

    Because this library is CS70-specific, you will need to compile code that uses it inside the CS 70 Docker container.

Your Turn

The following program generates a RandUInt32 object, then calls its get(…) member function 6000 times to generate numbers between 0 and 5 (i.e., six different outcomes) and summarizes the results:

#include <cs70/randuint32.hpp>
#include <cstddef>
#include <iostream>

constexpr size_t TOTAL_ROLLS = 6000;
constexpr size_t DICE_SIDES = 6;

int main() {
    RandUInt32 rand;

    size_t counts[DICE_SIDES] = {0, 0, 0, 0, 0, 0};
    for (size_t i = 0; i < TOTAL_ROLLS; ++i) {
        size_t roll = rand.get(DICE_SIDES);
        ++counts[roll];
    }

    std::cout << "Results from " << TOTAL_ROLLS << " dice rolls" << std::endl;
    for (size_t i = 0; i < DICE_SIDES; ++i) {
        std::cout << i << "\t" << counts[i] << std::endl;
    }
    return 0;
}

We'll give you two options for how to run this code. The first way mirrors how you'd use it in your own code for assignments. Use the link below to open the code in GitHub, and then clone that repository in Visual Studio Code and run it using our Docker container.

  • RHS Cow speaking

    Remember, you need to make VS Code run the program insider the container.

  • LHS Cow speaking

    You can build the program just by running make.

Or, alternatively, if that all seems like too much trouble, we've made it so you can run the code in Online GDB. (But for this case, we had to do a few tricks to let that environment use our library, so we made a small tweak to the #includes.)

How many times did the program roll a 3 when you ran this code?

When “Random” Isn't Actually Random

  • Duck speaking

    Are random numbers from RandUInt32 really random?

  • Alien speaking

    Is anything really random, avian creature? Perhaps your experience is the playback of a recording of some previous universe. Would your tiny, meat-based brain even notice?

  • LHS Cow speaking

    These are good questions, although maybe Feebleglorp could chill a bit.

  • RHS Cow speaking

    Let's think about whether “real” randomness, whatever that might be, is actually something we want.

When our algorithms need to make a random choice, they don't want to wait around while someone rolls real dice out in the world. They want to decide very quickly, otherwise that wait would become a bottleneck for the algorithm's performance.

In fact, modern random-number generators on today's machines are really fast. Good ones can produce a new random number in under a nanosecond!

To run quickly, random-number generators use deterministic algorithms that produce a long sequence of random-looking numbers that eventually cycles back and repeats.

  • Goat speaking

    Meh, that's not random at all!

  • LHS Cow speaking

    Well, that's one reason why sometimes they're called pseudorandom-number generators (PRNGs).

  • RHS Cow speaking

    But the truth is, the fixed sequence of outputs they produce is usually very long.

For example, the Mersenne Twister produces a staggering \( 2^{19937} - 1 \) outputs before it repeats itself (known as the period of the PRNG).

Other modern PRNGs are smaller than the Mersenne Twister, but still have staggeringly large periods.

If you start a modern PRNG from some arbitrary place in its period, you'll be producing a stream of random-looking numbers that no one on Earth has ever seen before.

  • Cat speaking

    I guess that seems kinda random.

  • Goat speaking

    Meh. Not random enough for me!

  • LHS Cow speaking

    Why not? Can you envision a scenario where a sequence of random-looking numbers no one has ever seen before wouldn't feel random when you saw them?

  • Goat speaking

    Meh. I don't know. I bet they aren't as random as real randomness.

  • LHS Cow speaking

    Oddly enough, the opposite is often true!

A lot of real-world randomness has some bias to it (e.g., flipping a fair coin is biased towards the side that was up when you started), and some processing is required to remove that bias. In contrast, good PRNGs produce random sequences that pass a broad range of statistical tests.

For many PRNGs, even if you see lots of their output, you still can't easily figure out where in their deterministic sequence they are, so they aren't preditable in that sense.

But PRNGs do provide reproducability. If you start the sequence of outputs at the same place, you get the same results.

  • Dog speaking

    Deterministic behavior that feels random?!? Cool!!

  • LHS Cow speaking

    It is cool!

Seeding...

Where you start your PRNG is known as the seed.

If you give your PRNG the same seed twice, it will produce the same sequence of numbers both times!

If your program's behavior depends on that sequence of numbers, you'll get the same outcomes in both runs, even though your program's behavior is “random”.

When you default construct our RandUInt32 class, it chooses the seed randomly.

  • Horse speaking

    Whoah, whoah! You need randomness before you can have randomness? I thought we had all these reasons not to use real randomness!

  • LHS Cow speaking

    One way to look at things is that PRNGs are an efficient amplifier for some initial randomness.

  • Duck speaking

    Where does the initial randomness for the seed come from?

  • LHS Cow speaking

    Good question!

  • RHS Cow speaking

    One big problem with C++'s (and C's) PRNG facilities is that they tend to assume that the programmer can figure that out somehow.

  • LHS Cow speaking

    Lots of programmers get this part wrong when they have to do it.

  • Goat speaking

    Yeah, because computers can't be random. Meh.

  • LHS Cow speaking

    Actually, real-world computers have lots of sources of unpredictable values (sometimes called entropy sources).

One example of an entropy source is the high-resolution clock in the machine. It would be very hard to know exactly which nanosecond of the clock we started our program, so we can use that (perhaps scrambled up a bit) to make a seed value.

Even better-quality randomess is maintained by the operating system, based on all kinds of entropy sources the computer has access to (like when you last pressed a key, or when network packets arrive).

The key thing is that although a random seed value might be a bit slow to compute (and require us to find a good entropy source to use), once we've initialized the generator, it can then run quickly, generating lots of good (pseudorandom) randomness.

  • RHS Cow speaking

    And remember, the default constructor for our RandUInt32 class takes care of finding entropy to make a random initial seed for you, so you don't have to worry about it!

  • Duck speaking

    But I could pick a non-random seed instead?

  • LHS Cow speaking

    Yes! In our RandUInt32 class, if you provide a constructor argument, it uses that integer to choose the starting point instead.

  • Duck speaking

    Why would I do that?

  • LHS Cow speaking

    Mostly for debugging. Setting the seed means you can get the same behavior from your program every time, even with “random“ numbers.

  • RHS Cow speaking

    For example, if your program segfaults in some runs but not others, you could set the seed yourself to reproduce the error consistently.

  • LHS Cow speaking

    That way you can investigate what's happening a lot more easily!

A Quick Demo

This program seeds the random-number generator before generating 10 numbers between 0 and 100.

Edit your myrand program to change main to match the code below, recompile and run it, and let us know what happens!

int main() {
    RandUInt32 rand_{1994}; // Here, our seed is a favorite year....

    for (size_t i=0; i < 10; ++i) {
        if (i > 0) {
            std::cout << ", ";
        }
        std::cout << rand_.get(100);
    }
    std::cout << std::endl;

    return 0;
}

What did the code print?

And how plausible does that feel?

If you think that's crazy, try the seed 1279079580, especially if you consider 3 to be a lucky number.

Final Thoughts on Seeding & Usage

  • Cat speaking

    I was pretty surprised when I saw 42 appear three times in a row!

  • Dog speaking

    That's nothing, that other seed had a pretty serious amount of three-ness.

  • LHS Cow speaking

    Yes, those examples are what Prof. Melissa calls a “PRNG party trick”.

  • RHS Cow speaking

    One way to come up with party tricks is to just step through seeds until you find one that does what you want.

  • Dog speaking

    Maybe you could cheat at the lottery!!

  • LHS Cow speaking

    Lotteries generally don't use PRNGs, but if they did, they'd be very careful not to let anyone control how the seed is generated.

  • RHS Cow speaking

    In this class, as we've mentioned, our goal for using a specific seed is not for party tricks—

  • LHS Cow speaking

    —it's so we can get the same reproducible behavior over and over for debugging!

  • Cat speaking

    So, what should we take away from this discussion?

Takeaways

  • Because RandUInt32 can seed itself with a good seed, only override it with a specific seed as a bonus feature for debugging.
  • Although RandUInt32 can seed itself, that doesn't mean you should create new RandUInt32 objects all the time. As much as possible, just create one that gets used a lot.
    • For example, in the homework, one RandUInt32 object per tree object is a good way to go.
  • Use the get() member function to get random numbers within a certain range.

Bonus Fun (Optional)

  • LHS Cow speaking

    Prof. Melissa is the designer of a (pseudo)random-number generator known as PCG that has a number of desirable properties.

  • RHS Cow speaking

    It was chosen as the default PRNG for NumPy!

  • LHS Cow speaking

    Prof. Melissa has a website for PCG and a blog which covers various interesting topics related to PRNGs.

  • RHS Cow speaking

    She also has a different library that makes using C++ RNGs easier called randutils, very much modelled after Python's random-number library. It's described in this blog post.

  • LHS Cow speaking

    It might be useful if you need easy-to-use random numbers in other projects, But we recommend you use RandUInt32 in this assignment.

If you're interested in how RandUInt32 is implemented, you can find the code on this page:

(When logged in, completion status appears here.)