Splay-Tree Complexity (Initial Exploration)
Meh. Okay, it does well with repeated lookups. But they can't all be repeats!
Fair enough. Not every operation will be looking up a recently added/found key.
Yeah, so how bad can it get?
Let's explore!
Return to the visualization and start fresh. (If you already have it open, refresh the page to reset to an empty tree.)
Now insert the following keys:
- 20
- 30
- 40
- 60
- 80
I know trees, mate, and that is a bodgy tree. It's not balanced!
Yup. As it turns out, it's possible for a splay tree to be a stick.
A red–black tree promises that the height of the tree is always \( \Theta( \log{n} ) \).
A splay tree doesn't make that promise. It is possible for a splay tree to become unbalanced, with a \( \Theta(n) \) height.
At the same time, even though the tree has grown unbalanced, notice that all of the insertions we've done so far have been cheap: Each node has been inserted as a child of the root and only had to do one rotation to become the new root.
Now let's try an expensive insert. Let's insert something close to the bottom.
Insert 10.
Crikey, that took all arvo. That was defo \( \Theta(n) \) time.
That's true, but the tree is more balanced now.
Fair dinkum. So the next lookup won't take so long. No worries!
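If you'd rather poke at this in code than in the visualization, here is a minimal bottom-up splay-tree sketch in Python. It is an illustrative sketch, not the visualization's actual implementation: `insert` does an ordinary BST insert at a leaf, splays the new node to the root, and returns the number of comparisons made as a rough "cost".

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.parent = None

class SplayTree:
    def __init__(self):
        self.root = None

    def _rotate(self, x):
        """Rotate x up one level, fixing all parent links."""
        p, g = x.parent, x.parent.parent
        if x is p.left:
            p.left = x.right
            if p.left:
                p.left.parent = p
            x.right = p
        else:
            p.right = x.left
            if p.right:
                p.right.parent = p
            x.left = p
        p.parent = x
        x.parent = g
        if g is None:
            self.root = x
        elif g.left is p:
            g.left = x
        else:
            g.right = x

    def _splay(self, x):
        """Move x to the root with zig, zig-zig, and zig-zag steps."""
        while x.parent is not None:
            p, g = x.parent, x.parent.parent
            if g is None:                         # zig: parent is the root
                self._rotate(x)
            elif (x is p.left) == (p is g.left):  # zig-zig: same side twice
                self._rotate(p)
                self._rotate(x)
            else:                                 # zig-zag: opposite sides
                self._rotate(x)
                self._rotate(x)

    def insert(self, key):
        """BST-insert key at a leaf, then splay it to the root.
        Returns the number of comparisons made (the 'cost')."""
        if self.root is None:
            self.root = Node(key)
            return 0
        cur, comparisons = self.root, 0
        while True:
            comparisons += 1
            side = "left" if key < cur.key else "right"
            child = getattr(cur, side)
            if child is None:
                child = Node(key)
                child.parent = cur
                setattr(cur, side, child)
                self._splay(child)
                return comparisons
            cur = child

def height(node):
    return 0 if node is None else 1 + max(height(node.left), height(node.right))

t = SplayTree()
for k in [20, 30, 40, 60, 80]:
    print(k, "cost:", t.insert(k))   # every insert after the first costs 1
print("height:", height(t.root))     # 5 -- five nodes, a stick
print(10, "cost:", t.insert(10))     # 5 -- walked the entire stick
print("height:", height(t.root))     # still 5, but now holding 6 nodes
```

The printout mirrors the experiment above: the first five inserts each cost one comparison (the new key lands right under the root and one zig makes it the new root), while inserting 10 walks the entire stick. Afterwards, 10 is the root and the tree is no longer a stick.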
Worst-Case Time
We've seen that a splay tree can be a stick.
As a result, the worst-case time of both insert and lookup is \( \Theta(n) \).
Best-Case Time
We know that the best-case time of lookup is \( \Theta(1) \), because the key can be in the root.
Also, since a splay tree can be a stick, we've seen that insert can also take \( \Theta(1) \) time, if the key is inserted as a child of the root.
So the best-case time of both insert and lookup is \( \Theta(1) \).
Overall
Putting those together: insert and lookup each take \( \Theta(1) \) time in the best case and \( \Theta(n) \) time in the worst case.
More to the Story
In a simple BST, it is possible for every insertion to take \( \Theta(n) \) time. All we have to do is keep adding to the bottom of a stick.
So, if we insert \( m \) keys, in the worst case it would take \( \Theta( 1 + 2 + 3 + \cdots{} + m ) = \Theta( m^2 ) \) time.
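That quadratic blow-up is easy to check empirically. A quick sketch (a plain BST with no balancing and no splaying, counting comparisons as the cost):

```python
def total_insert_comparisons(keys):
    """Insert keys, in order, into a plain unbalanced BST;
    return the total number of comparisons made."""
    root, total = None, 0
    for key in keys:
        node = {"key": key, "left": None, "right": None}
        if root is None:
            root = node
            continue
        cur = root
        while True:
            total += 1
            side = "left" if key < cur["key"] else "right"
            if cur[side] is None:
                cur[side] = node
                break
            cur = cur[side]
    return total

# Ascending keys build a stick, so key number i walks past all i-1
# earlier keys: 0 + 1 + ... + (m-1) = m(m-1)/2 comparisons in total.
for m in [100, 200, 400]:
    print(m, total_insert_comparisons(range(m)))  # 4950, 19900, 79800
```

Doubling \( m \) quadruples the total, exactly as \( \Theta(m^2) \) predicts.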
A Red–Black tree promises that the height of the tree is always \( \Theta( \log{n} ) \). Thus insertion always takes \( \Theta( \log{n} ) \) time, no matter what.
So, if we insert \( m \) keys, in the worst case it would take \( \Theta( \log{1} + \log{2} + \log{3} + \cdots{} + \log{m} ) = \Theta( m\log{m} )\) time.
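One way to see that last step, if the sum looks mysterious: it collapses into a single factorial,
\[ \log{1} + \log{2} + \cdots{} + \log{m} = \log( m! ), \]
and since \( (m/2)^{m/2} \le m! \le m^m \) (the top half of the factors are each at least \( m/2 \), and all \( m \) factors are at most \( m \)), taking logs gives
\[ \tfrac{m}{2} \log{\tfrac{m}{2}} \;\le\; \log( m! ) \;\le\; m \log{m}, \]
so \( \log( m! ) = \Theta( m \log{m} ) \).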
Where do splay trees fit in?
There's no getting around the fact that it's possible for a single operation to take linear time. Clearly the worst-case time of the Red–Black tree is better.
However, we also saw that
- An expensive (linear-time) operation was only possible after a bunch of cheap operations made the tree unbalanced; and
- After an expensive operation, the splay operation makes the tree more balanced, so operations will be cheap for a while… until the tree becomes unbalanced again.
So while it's true that a single insertion could take linear time, we can't seem to make every insertion take linear time.
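We can put a rough number on that. The sketch below (a condensed version of the earlier minimal splay tree, with a comparison-counting `lookup` added) plays the adversary: it inserts \( m \) ascending keys to build a stick, then looks every key up in ascending order. The first lookup walks the whole stick, but splaying keeps paying that cost forward, and the grand total stays far below the \( \Theta(m^2) \) that a plain BST allows.

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right, self.parent = key, None, None, None

class SplayTree:
    def __init__(self):
        self.root = None

    def _rotate(self, x):
        """Rotate x up one level, fixing all parent links."""
        p, g = x.parent, x.parent.parent
        if x is p.left:
            p.left, x.right = x.right, p
            if p.left:
                p.left.parent = p
        else:
            p.right, x.left = x.left, p
            if p.right:
                p.right.parent = p
        p.parent, x.parent = x, g
        if g is None:
            self.root = x
        elif g.left is p:
            g.left = x
        else:
            g.right = x

    def _splay(self, x):
        while x.parent is not None:
            p, g = x.parent, x.parent.parent
            if g is None:                         # zig
                self._rotate(x)
            elif (x is p.left) == (p is g.left):  # zig-zig
                self._rotate(p)
                self._rotate(x)
            else:                                 # zig-zag
                self._rotate(x)
                self._rotate(x)

    def _descend(self, key):
        """Walk toward key; return (last node reached, comparisons made)."""
        cur, comps = self.root, 0
        while True:
            comps += 1
            if key == cur.key:
                return cur, comps
            nxt = cur.left if key < cur.key else cur.right
            if nxt is None:
                return cur, comps
            cur = nxt

    def insert(self, key):
        if self.root is None:
            self.root = Node(key)
            return 0
        spot, comps = self._descend(key)
        node = Node(key)
        node.parent = spot
        if key < spot.key:
            spot.left = node
        else:
            spot.right = node
        self._splay(node)
        return comps

    def lookup(self, key):
        node, comps = self._descend(key)
        self._splay(node)   # splay whatever node the search reached
        return comps

m = 512
t = SplayTree()
insert_cost = sum(t.insert(k) for k in range(1, m + 1))  # cheap: 1 each
lookup_cost = sum(t.lookup(k) for k in range(1, m + 1))  # first one costs m
total = insert_cost + lookup_cost
print("splay total:", total, "   plain-BST worst case:", m * (m - 1) // 2)
```

The inserts cost exactly one comparison apiece, and even though the very first lookup pays the full \( \Theta(m) \), the overall total grows nowhere near quadratically. (This is an informal experiment, not a proof; the proof is where the rest of the lesson is headed.)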
I'm getting flashbacks to Quinn's train. We couldn't make that one keep hitting its worst case either.
Even though they have the same worst-case complexity as a simple (unbalanced) BST for a single insert operation, it seems like inserting \( m \) keys into a splay tree might be more efficient. Can we capture that difference in our complexity analysis?
Hey! I looked up splay-tree complexity on Wikipedia and saw something weird.
Oh? What did you see?
This:
It says \( \mathrm{O}( \log{n} ) \) complexity! Hooray!
Not quite. It still says that the worst case is linear, which is what we proved for ourselves.
Huh. So what does “amortized” mean?
That is the subject of the rest of this lesson!
Oh…