Writing Tests
Good testing requires attention to detail, systematic thinking, and creativity.
Here are some big ideas to keep in mind when you write tests. (You'll also want to read the documentation for the CS 70 testing library.)
Testing is an Ethical Responsibility
Faulty software can cause real damage. You may recall the story of the Therac-25 radiation therapy machine from Week 2. That's an example where inadequate testing directly caused loss of life. Though the stakes aren't always quite that high, bugs can also cause security/privacy failures, and significant loss of time, productivity, or revenue.
Learning to test and debug is part of learning to program. Testing and debugging are not extra work that will hopefully not be necessary; they are always part of the process!
It's also good for your grade!
Testing is also good for your grade! Good testing will reveal bugs in your code so that you can fix them before you submit, thus improving your correctness grade. In addition, we will be evaluating your tests by applying them to both correct and incorrect implementations to see how many bugs you can catch!
So try to approach the testing component of each assignment with just as much creative energy and eagerness to learn as the implementation component.
Boundaries of What Can Be Tested
There are some limitations as to what your tests can do, however.
Your Tests Must Only Use the Public Interface
In principle, you should be able to write your tests before you write a single line of code implementing the thing you are testing. All you need to know is what each function/operator promises to do!
Your testing suite should be applicable to any implementation of the supplied interface. For this reason, you must only test using the public interface.
You might be tempted to look inside. You might want to say, "If I perform this operation, something interesting should happen to the member variables and I want to make sure that happens."
The problem is this: someone else could write a totally different, but totally correct implementation of the interface, and your tests would fail. Even if you aren't worried about other people's implementations, you might want to change your implementation at some point. You shouldn't have to change your tests too.
Also… your testing functions only have access to the public interface and can't access private information, so there's that too.
Your Tests Must Not Break the Rules
When you use any piece of code functionality, there are often rules to follow (e.g., “you must always …” or “you must not …”). But whereas in some languages the convention is to always detect misuse and give an error, C++ interfaces typically make no promises that misuse will be detected, and don't specify what will happen if misuse occurs.
In other words, C++'s legendary “undefined behavior.”
Exactly.
But can I test “undefined behavior”?
No!
Trying to test undefined behavior doesn't make sense—by definition there is no known correct outcome to test against. Your tests must follow all usage rules given in the specification.
Your Tests Must Not Make Extra Assumptions
But if I know my code always returns zero for undefined behavior, can't I test that?
No. Because your tests aren't just for your implementation.
We will run your tests on a correct implementation that may differ significantly from your own.
And If You Don't…
Each of the sections above established a boundary for testing. If you don't stay within these boundaries, and thus
- Try to test something that wasn't in the specified public interface,
- Write a test that does something that the public interface said wasn't allowed, or
- Lock your tests to specific details of your implementation that aren't part of the specified interface
then it is highly likely that your tests will fail on other correct implementations, which is a big problem.
In fact, if your tests fail a known-correct implementation, you will get a score of ZERO for your tests!
What!!? Why…?
If your tests claim that a known-correct implementation is wrong, something is seriously wrong with the tests. Either they
- Make invalid assumptions, or
- Break the rules and thus trigger undefined behavior (which is even worse!).
There is no way to trust the reliability of your tests if they fail correct implementations: they might fail all of our incorrect implementations too, but for the wrong reasons. A valid testing suite must reliably distinguish between a correct and an incorrect implementation.
Because it is such a huge deal, the autograder will warn you if your tests fail the correct implementation. That will allow you to hunt down the flawed test and eliminate it. You should plan to submit your project well before the deadline (even if it is not quite complete), in case the autograder's output contains important warnings. You can submit as many times as you like.
Okay, that's what we shouldn't do, but what should we do?
Practical Advice
Keep Your Tests Limited and Focused
One of the goals of testing is to determine if something is wrong. Another goal, nearly as important, is to help determine where in the code it is wrong.
If your test is focused on the behavior of a specific, manageable subset of the code, you have a small debugging region when the test fails.
Should each test test exactly one piece of functionality?
Small tests are good, but that is probably too small.
Since you can only use the public interface, you will often have to perform a number of operations just to set up a scenario so you can test. Often you will really be testing relationships between operations rather than single isolated operations (e.g., if I insert something, do I get the right result from the size function?).
Rule of thumb: try to arrange your tests so that each one tests a very limited set of new operations/cases, otherwise making use of things that have already been tested.
Adopt the Right Mindset for Testing
At its core, testing is about finding bugs. Rather than trying to prove that your own code is correct, imagine that your job is to find a flaw in someone else's code. Imagine plausible ways they might have screwed up and introduced a bug, and whether you can figure out a way to detect (or just trigger) that bug.
- For each function, brainstorm as many publicly observable things as you can that must be true after the function is called.
- Don't just focus on the main purpose of the function. Often there are other, more peripheral conditions that should or shouldn't change as well.
- Each one of those is an opportunity for someone much less clever than you (obviously) to cause incorrect behavior.
- Write tests that check those assumptions.
Of course, you can't test every possible scenario! But there are ways to focus your attention:
- Think about “edge cases”. That is, consider behavior at the extremes of the input, as well as at transition points where behavior should change.
- Think about coverage. Do different cases trigger different behavior (and therefore probably different codepaths)? Make sure you test all of the high-level possibilities.
- Notice your own bugs. Did you cause a bug and then fix it? Consider writing a test that would catch that bug, if you don't already have one!
- Imagine other people's bugs. Even if you carefully avoided the bug, where did things get complicated or subtle? What is easy to get wrong?