CS 70

Analyzing Randomized Trees

Worst Case

Randomized trees sometimes keep themselves balanced by moving an inserted item up to become the root of one of the subtrees of the tree.

  • Duck speaking

    But… couldn't that mean that we sometimes… build a stick?

  • LHS Cow speaking

    In theory, yes. Randomized trees, just like random trees, don't do anything about the worst-case possibility for trees.

  • RHS Cow speaking

    But in practice, a stick is fantastically unlikely.

The worst-case complexity of lookup/insert/remove in a randomized BST is \( \Theta(n) \). However the worst case is fantastically unlikely!

The chance a randomized BST with \( n \) nodes will be a stick is \( 2^{n-1}/n! \).

For a 31-node tree, the chances are about 7,658,115,266,056,659,461,674,804 to 1, that's nearly 8 septillion, or 8 million billion billion.

It's actually far more likely the tree would come out perfectly balanced, at only 109,876,902,975 to 1.

  • LHS Cow speaking

    So the theoretical worst case takes into account vanishingly unlikely possibilities.

  • RHS Cow speaking

    It's more likely that your computer will catch fire than that either of those things will happen.

  • LHS Cow speaking

    And that's for a really tiny tree. For larger trees, the probability of a pathological tree shape is even more unimaginably unlikely.

Expected Case

No matter what order keys are inserted in, randomized trees build a random tree. Thus all the properties of random trees apply.

Note that this result is stronger than the one we derived for the random trees generated by basic (unbalanced) BSTs. In that case, we needed a particular probability distribution over key orderings (uniform random). Here, because the randomness is built into the tree, we can always make this expected-complexity promise no matter how keys are inserted.

Thus, we can expect a randomized BST will be untidy but “balanced enough”, with \( \Theta( \log{n} ) \) height, and average node depths that are 38.6% worse than a completely perfect tree.

Summary

Here are the key complexity facts about randomized trees:

  • Best-case lookup/insert/remove: \( \Theta(1) \); for example,
    • lookup, when desired key is the root of the tree (1 in \( n \) chance of this case).
    • insert, when root of the tree is the maximum (1 in \( n \) chance of this case) and inserted key greater value (e.g., when inserting items in order).
  • Worst-case lookup/insert/remove: \( \Theta(n) \)
    • An entirely theoretical result, since anything even vaguely stick-like is vanishingly improbable in practice.
  • Expected-case lookup/insert/remove: \( \Theta( \log{n} ) \)

Can we say that the worst-case cost of insertion in a randomized tree is \( \Theta( \log{n} ) \)?

(When logged in, completion status appears here.)