Generating Random Numbers in C++
When we implement our randomized insert, we can't exactly pull out an \( N \)-sided die every time we make a recursive call to insert!
Yeah. I mean, what if we can't find our 563-sided die?
That's… definitely the biggest logistical challenge, Dog.
But you're all right, of course—we can't have our insert function stop and wait for a die roll.
Instead, we need to simulate the die roll (or coin toss) by generating a random number in code.
From Dice to C++
Unfortunately, while C++ provides expansive random-number generation facilities with many capabilities, none of them are especially simple to use.
In contrast, Java makes getting random numbers easy, with java.util.Random
(although they aren't very good quality random numbers). Python has similar functionality in its random
module.
C++'s random-number facility is more complex. That complexity gives us a lot of control as programmers, which matches what we'd expect from C++.
It also means that when you just want to do something like get a random number in a certain range, it can feel like a lot of scaffolding is needed to get there.
How about we use
rand()
from C? That's pretty easy.C's built-in random facilities are pretty primitive and have their own issues. And
cpplint
considersrand()
problematic enough that it won't let you use it.Okay, so can we do things the C++ way?
We will. But the goal for this assignment is to learn about trees and recursion and iterators and things.
We really, really don't want you spending tons of time on figuring out how to generate a random number properly in C++.
So for this class, we provide you with a random-number generator in the CS 70 Docker container.
The CS 70 RandUInt32
Class
Our Docker image has a library called randuint32. You can include it in your code with:
#include <cs70/randuint32.hpp>
The library defines a class called RandUInt32
.
Instances of that class can generate uniformly distributed random integers. You access those numbers by calling the class's get(…)
member function, which takes in a number N
, and returns a random integer between 0
and N-1
(and thus has N
distinct possible values).
What if I call
myGenerator.get(0)
?Don't do that.
Maybe something spooky happens! Perhaps magic smoke comes out of the computer! You never know with undefined behavior!
Any time you want to compile and link code that uses the RandUInt32
class, you will need to tell the linker
-lranduint32
to link in theranduint32
library.
So if your program is called myrand.cpp
, you would compile to a .o
file like normal and then link with our random-number library. For example,
clang++ -c -std=c++17 -Wall -Wextra -pedantic myrand.cpp
clang++ -o myrand myrand.o -lranduint32
Because this library is CS70-specific, you will need to compile code that uses it inside the CS 70 Docker container.
Your Turn
The following program generates a RandUInt32
object, then calls its get(…)
member function 6000 times to generate numbers between 0 and 5 (i.e., six different outcomes) and summarizes the results:
#include <cs70/randuint32.hpp>
#include <cstddef>
#include <iostream>
constexpr size_t TOTAL_ROLLS = 6000;
constexpr size_t DICE_SIDES = 6;
int main() {
RandUInt32 rand;
size_t counts[DICE_SIDES] = {0, 0, 0, 0, 0, 0};
for (size_t i = 0; i < TOTAL_ROLLS; ++i) {
size_t roll = rand.get(DICE_SIDES);
++counts[roll];
}
std::cout << "Results from " << TOTAL_ROLLS << " dice rolls" << std::endl;
for (size_t i = 0; i < DICE_SIDES; ++i) {
std::cout << i << "\t" << counts[i] << std::endl;
}
return 0;
}
We'll give you two options for how to run this code. The first way mirrors how you'd use it in your own code for assignments. Use the link below to open the code in GitHub, and then clone that repository in Visual Studio Code and run it using our Docker container.
Remember, you need to make VS Code run the program insider the container.
You can build the program just by running
make
.
Or, alternatively, if that all seems like too much trouble, we've made it so you can run the code in Online GDB. (But for this case, we had to do a few tricks to let that environment use our library, so we made a small tweak to the #include
s.)
When “Random” Isn't Actually Random
Are random numbers from
RandUInt32
really random?Is anything really random, avian creature? Perhaps your experience is the playback of a recording of some previous universe. Would your tiny, meat-based brain even notice?
These are good questions, although maybe Feebleglorp could chill a bit.
Let's think about whether “real” randomness, whatever that might be, is actually something we want.
When our algorithms need to make a random choice, they don't want to wait around while someone rolls real dice out in the world. They want to decide very quickly, otherwise that wait would become a bottleneck for the algorithm's performance.
In fact, modern random-number generators on today's machines are really fast. Good ones can produce a new random number in under a nanosecond!
To run quickly, random-number generators use deterministic algorithms that produce a long sequence of random-looking numbers that eventually cycles back and repeats.
Meh, that's not random at all!
Well, that's one reason why sometimes they're called pseudorandom-number generators (PRNGs).
But the truth is, the fixed sequence of outputs they produce is usually very long.
For example, the Mersenne Twister produces a staggering \( 2^{19937} - 1 \) outputs before it repeats itself (known as the period of the PRNG).
Other modern PRNGs are smaller than the Mersenne Twister, but still have staggeringly large periods.
If you start a modern PRNG from some arbitrary place in its period, you'll be producing a stream of random-looking numbers that no one on Earth has ever seen before.
I guess that seems kinda random.
Meh. Not random enough for me!
Why not? Can you envision a scenario where a sequence of random-looking numbers no one has ever seen before wouldn't feel random when you saw them?
Meh. I don't know. I bet they aren't as random as real randomness.
Oddly enough, the opposite is often true!
A lot of real-world randomness has some bias to it (e.g., flipping a fair coin is biased towards the side that was up when you started), and some processing is required to remove that bias. In contrast, good PRNGs produce random sequences that pass a broad range of statistical tests.
For many PRNGs, even if you see lots of their output, you still can't easily figure out where in their deterministic sequence they are, so they aren't preditable in that sense.
But PRNGs do provide reproducability. If you start the sequence of outputs at the same place, you get the same results.
Deterministic behavior that feels random?!? Cool!!
It is cool!
Seeding...
Where you start your PRNG is known as the seed.
If you give your PRNG the same seed twice, it will produce the same sequence of numbers both times!
If your program's behavior depends on that sequence of numbers, you'll get the same outcomes in both runs, even though your program's behavior is “random”.
When you default construct our RandUInt32
class, it chooses the seed randomly.
Whoah, whoah! You need randomness before you can have randomness? I thought we had all these reasons not to use real randomness!
One way to look at things is that PRNGs are an efficient amplifier for some initial randomness.
Where does the initial randomness for the seed come from?
Good question!
One big problem with C++'s (and C's) PRNG facilities is that they tend to assume that the programmer can figure that out somehow.
Lots of programmers get this part wrong when they have to do it.
Yeah, because computers can't be random. Meh.
Actually, real-world computers have lots of sources of unpredictable values (sometimes called entropy sources).
One example of an entropy source is the high-resolution clock in the machine. It would be very hard to know exactly which nanosecond of the clock we started our program, so we can use that (perhaps scrambled up a bit) to make a seed value.
Even better-quality randomess is maintained by the operating system, based on all kinds of entropy sources the computer has access to (like when you last pressed a key, or when network packets arrive).
The key thing is that although a random seed value might be a bit slow to compute (and require us to find a good entropy source to use), once we've initialized the generator, it can then run quickly, generating lots of good (pseudorandom) randomness.
And remember, the default constructor for our
RandUInt32
class takes care of finding entropy to make a random initial seed for you, so you don't have to worry about it!But I could pick a non-random seed instead?
Yes! In our
RandUInt32
class, if you provide a constructor argument, it uses that integer to choose the starting point instead.Why would I do that?
Mostly for debugging. Setting the seed means you can get the same behavior from your program every time, even with “random“ numbers.
For example, if your program segfaults in some runs but not others, you could set the seed yourself to reproduce the error consistently.
That way you can investigate what's happening a lot more easily!
A Quick Demo
This program seeds the random-number generator before generating 10 numbers between 0 and 100.
Edit your myrand
program to change main
to match the code below, recompile and run it, and let us know what happens!
int main() {
RandUInt32 rand_{1994}; // Here, our seed is a favorite year....
for (size_t i=0; i < 10; ++i) {
if (i > 0) {
std::cout << ", ";
}
std::cout << rand_.get(100);
}
std::cout << std::endl;
return 0;
}
If you think that's crazy, try the seed 1279079580, especially if you consider 3 to be a lucky number.
Final Thoughts on Seeding & Usage
I was pretty surprised when I saw 42 appear three times in a row!
That's nothing, that other seed had a pretty serious amount of three-ness.
Yes, those examples are what Prof. Melissa calls a “PRNG party trick”.
One way to come up with party tricks is to just step through seeds until you find one that does what you want.
Maybe you could cheat at the lottery!!
Lotteries generally don't use PRNGs, but if they did, they'd be very careful not to let anyone control how the seed is generated.
In this class, as we've mentioned, our goal for using a specific seed is not for party tricks—
—it's so we can get the same reproducible behavior over and over for debugging!
So, what should we take away from this discussion?
Takeaways
- Because
RandUInt32
can seed itself with a good seed, only override it with a specific seed as a bonus feature for debugging. - Although
RandUInt32
can seed itself, that doesn't mean you should create newRandUInt32
objects all the time. As much as possible, just create one that gets used a lot.- For example, in the homework, one
RandUInt32
object per tree object is a good way to go.
- For example, in the homework, one
- Use the
get()
member function to get random numbers within a certain range.
Bonus Fun (Optional)
Prof. Melissa is the designer of a (pseudo)random-number generator known as PCG that has a number of desirable properties.
It was chosen as the default PRNG for NumPy!
Prof. Melissa has a website for PCG and a blog which covers various interesting topics related to PRNGs.
She also has a different library that makes using C++ RNGs easier called
randutils
, very much modelled after Python's random-number library. It's described in this blog post.It might be useful if you need easy-to-use random numbers in other projects, But we recommend you use
RandUInt32
in this assignment.
If you're interested in how RandUInt32
is implemented, you can find the code on this page:
(When logged in, completion status appears here.)