CS 70

C++ Implementation: Insert Alternatives

  • Duck speaking

    I don't think the insert we wrote is very efficient. It calls exists first and searches the tree, and then goes over the tree a second time to actually insert the data.

  • LHS Cow speaking

    Okay… Let's explore that!

Our code for insert looks like this:

// insert k, no-op if the key is already in the tree
void IntTree::insert(int k) {
    if (!exists(k)) { // Don't insert keys already in the tree
        insertHelper(root_, k);
    }
}

Our version of insert calls exists (which we've not written yet but will traverse the tree similarly to insert) to make sure the item isn't already in the tree before we use our recursive helper function.

Does the use of exists change the asymptotic complexity of our insert algorithm?

  • Duck speaking

    Okay, sure, it's a constant-factor difference. But if I do twice as much work, my program will only go at half the speed!

  • LHS Cow speaking

    Not really. It's only tree insertion we're talking about, not the whole program; and, in practice, on modern machines the speed difference is not likely to be anything like a factor of two. In general, when people make guesses about how fast code will run, they're usually wrong.

  • Duck speaking

    But why not make it faster anyway. Faster is better, right?

  • LHS Cow speaking

    Speed isn't the only thing to consider when writing code.

What's a benefit of using the exists check before calling insertHelper?

Making sure we actually need to insert something into the tree before calling insertHelper makes the code for insertHelper simpler. The difference is modest in this case, but it can make more of a difference for some of the other insertion algorithms we'll see in future lessons.

Other Options

  • Pig speaking

    But can we see MORE ways to do it.

  • Duck speaking

    For efficiency!!

  • LHS Cow speaking

    Okay.

Undefined Behavior

Here's the simplest solution. Shift the burden onto the user. Tell them they just aren't allowed to insert the same thing twice. If they might do that, it'll be up to them to call exists first, not our code.

// It is undefined behavior to insert something into the tree
// that's already there.
void IntTree::insert(int k) {
    insertHelper(root_, k);
}

Checking in insertHelper

Or, we could make insertHelper handle finding that the desired value is already in the tree.

// This code can handle the item being in the tree
// It now returns a bool to indicate whether it actually
// inserted anything.
bool IntTree::insertHelper(Node*& tree, int k) {
    if (tree == nullptr) {
        tree = new Node{k, nullptr, nullptr};
        return true;
    }
    else if (k < tree->key_) {
        return insertHelper(tree->left_, k);
    }
    else if (k > tree->key_)  {
        return insertHelper(tree->right_, k);
    }
    else { // k == tree->key_
        return false;
    }
}

This version returns true if it actually inserted something, and false if it found it already there.

  • Cat speaking

    Why did you make it return a bool, you're not using that result when you call it from insert?

  • LHS Cow speaking

    Because it's good practice for this style of helper. More complicated tree-insertion algorithms may need to know whether or not something was inserted.

  • RHS Cow speaking

    But honestly, when you're writing code for this class, it's easier if you don't add complexity to those algorthms when you don't have to.

Which will you choose…?

(When logged in, completion status appears here.)