Empirically Timing Programs
I'm so excited that we all got to (virtually) attend the BarnComp conference this year!
Getting to hear from all of the researchers who are working on new ways to sort Barns was sooooooo cool!
Were you at that session this morning where Professor McDonald presented WildSort? It can sort a list of 100,000 Barn names in 0.16 seconds!
Hay guys! Sorry I'm late! I just got out of a talk by Professor McGregor. It was all about a new algorithm for sorting Barn names called SneakerSort. SneakerSort can sort a list of 100,000 Barn names in 0.03 seconds!!!
Wow, that sounds like it's even faster than WildSort!
Guess it's a good thing Professor McDonald presented this morning. Who would ever use WildSort now?!
There's a lot of information we still don't know about the results presented by Profs. McDonald and McGregor.
For example…
- What list of names did they sort? Was one of them already sorted?
- What computer did they run their algorithms on? Did Prof. McGregor have access to an NSA supercomputer or something?
- What else was running at the same time as the algorithms? Was Prof. McDonald's test done by a grad student who was simultaneously streaming ten different movies on Netflix?
- What languages were their algorithms implemented in? If Prof. McDonald was using Python, then the test results might be affected by Python having a longer start-up time since it's interpreted. What would happen if they were both implemented in C++?
- What if we don't have exactly 100,000 Barns to sort? How do they do on 5,000 Barns? Or 5,000,000 Barns?
Yes, a lot that we don't know! There's not much we can conclude from single data points taken out of context!
In short, empirical testing (like timing a program) can tell us something about
- A particular algorithm,
- Written in a particular language,
- For a particular program,
- Compiled using a particular version of a particular compiler,
- With particular settings (e.g., enabling / disabling optimizations),
- Running on a particular data set, of a particular size,
- On a particular computer,
- With particular resources (CPUs, memory, hard drive, …),
- Under a particular version of a particular operating system,
- In a particular environment (e.g., with other programs running in the background).
The Role of Empirical Testing
Got it. Timing programs is pointless!
No, no. It's just limited!
Sometimes you really do want to know about runtime under specific circumstances.
Sometimes you really have good reason to believe that you can extrapolate or interpolate the results to similar circumstances.
The nice thing about timing a program is that it gets you a very concrete answer: “It took this long.”
That's very, very close to the question we often want to answer, which is “How long will it take?” The gap between the two is how well we can or cannot extrapolate from the results we have to the situation we want to know about.
As with any empirical findings, we have to be careful about the conclusions we draw and the predictions we make!
Sometimes, with sufficient expertise/experience/information, we might be willing to extrapolate to at least a ballpark estimate.
Sometimes, with sufficient expertise/experience/information, we'll know that such an extrapolation would be unsafe or unreliable!
In order for an empirical comparison to be informative…
- The conditions of the experiment should be carefully and thoroughly documented;
- The conditions should, as much as possible, be the same for the algorithms being compared;
- The impact of experimental noise should be mitigated with proper statistical analysis; and,
- A variety of experiments should be run in order to study sensitivity to experimental conditions.
Meh. That sounds like natural science. I'm a computer-scientist goat, not a biologist goat!
Well, yes! Sometimes computer scientists use natural-science methods to study artificial things!
Sometimes it's the best or even the only way we can think of to answer a question.
When we do, we have to use empiricism responsibly, just like all the other scientists.
As in any other science, empirical findings gain more meaning when they are accompanied by theoretical findings that can contextualize the data and maybe give us a model for ways that our empirical conclusions might or might not apply to new experimental conditions.
Most of our attention this semester will be on some of the theoretical tools that computer scientists use to reason about the efficiency of algorithms. But it is not uncommon for computer scientists to combine theoretical and empirical work to get a more complete picture.
(When logged in, completion status appears here.)