Case Study: Linear Search
Let's practice our complexity analysis skills with a really important algorithm!
It's an algorithm you use so often that you probably don't even know it has a name…
Look Up in an Array
Say we have an array of \( n \) numbers. We are given a number \( x \) and we want to determine if \( x \) is in the array or not.
Here's some code for that:
bool lookUp(float x, float* arr, size_t arrSize) {
for (size_t i = 0; i < arrSize; ++i) {
if (arr[i] == x) {
return true; // Found it!
}
}
return false; // If we are here, we've compared x to everything in the array
}
This algorithm is called "linear search."
Let's analyze it!
Best-Case Analysis
In the best case, we find x
right away! That happens when the code matches with the first element.
So we compare against exactly one element of the array and then return.
There are a bunch of operations involved (initialization, indexing, comparison, etc.), but the number of them is not related to \( n \).
Thus we can say that the best-case complexity is \( \Theta(1) \). In English, we might say that it "takes constant time" or is "order 1".
Worst-Case Analysis
In the worst case, we don't find x
at all.
In that case we compare against every element of the array and then return, which means the loop iterates \( n \) times. Each iteration of the loop takes \( \Theta(1) \) time, so we can say that the worst-case complexity is \( \Theta(n) \). In English we might say that it "takes linear time" or is "order \( n \)".
Expected-Case Analysis
So the best- and worst-case complexities are different orders! Is the function's typical behavior close to the best case, or closer to the worst case?
Remember that to perform an expected-case analysis, we have to make some assumptions about how likely different cases are.
Let's assume that
- The value of
x
is present in the array, and - The value of
x
is equally likely to be in any position in the array.
Cases: We have \( n \) cases: one for each possible position in the array. In Case 1, x
is the first element of the array; in Case 2, x
is the second element; and so on.
So for Case \( k \), we must compare against \( k \) elements in the array before finding x
.
Probabilities: Under our assumption of equal probabilities, each case has probability \( \frac{1}{n} \).
Expected complexity: We have
\[ \begin{align} & \sum^{n}_{k=1} \mathrm{Pr}( \mathrm{Case}\ 1 ) \cdot{} \mathrm{Cost}\ \cdot{} ( \mathrm{Case}\ k ) \\ & = \sum_{k=1} \left( \frac{1}{n} \right) k \\ & = \frac{1}{n} \cdot \sum^{n}_{k=1} k. \end{align} \]
Hopefully we can see where our good friend Gauss can help us here:
$$ \frac{1}{n} \cdot \sum^{n}_{k=1} k = \frac{1}{n} \cdot \frac{ n( n + 1 ) }{2} = \frac{ n + 1}{2}. $$
Thus we can see that \( \frac{ n + 1 }{2} \in \Theta(n) \).
So in this example, the expected cost is more like the worst case. It's good to know that we should just expect it to take linear time.
Meh. Is the expected-case complexity ever asymptotically better than the worst-case complexity?
Yes! Later this semester we will encounter some important examples where the worst case is bad but exceedingly rare.
In those examples the expected complexity is more like the best case than the worst.
Linear Search: Summary
We've concluded that linear search has
- \( \Theta(1) \) best-case complexity;
- \( \Theta(n) \) worst-case complexity; and
- \( \Theta(n) \) expected-case complexity, assuming that we are equally likely to find
x
in any location.
(When logged in, completion status appears here.)