CS 70

What's an Interface?

Often, we want to describe what we want a data structure to be able to do—the operations it supports. When trying to specify these kinds of fundamental ideas, it's appropriate to do so in a way that is independent of the specific language we're working on, and to specify what we want to happen without saying how it should all work internally.

This description is often just called “the interface”, or the “fundamental operations”. In the last lesson, we discussed the fundamental operations that iterators have to provide, no matter what programming language they're written for.

List the fundamental operations of a stack data structure.

The Stack: Fundamental Operations

The fundamental operations of a stack are probably

  • PUSH(stack, elem)—push an element onto the stack, it becomes the new item at the top of the stack.
  • POP(stack)—removes the top element from the stack (and returns it)
  • IS-EMPTY(stack)—determines whether the stack is empty.

(Notice that we've written the fundamental operations in SMALL-CAPS-TEXT-FONT to make it clear that we're not being C++ specific.)

Perhaps you had other operations in mind? Some other possible (if less-fundamental) options are

  • SIZE(stack)—determine the size of the stack.
  • TOP(stack)—provides access to the top element of the stack
  • MAKE-ITERATOR(stack)—produces an iterator that can go through the elements of the stack in some order.
  • Cat speaking

    This all seems a bit vague…

  • LHS Cow speaking

    It is. People agree on the general idea, but even when we try to specify the behavior of a data structure independent of language, there is usually some wiggle room, and different choices cause different trade-offs.

  • Duck speaking

    Why didn't you use object-oriented syntax, like stack.SIZE()?

  • LHS Cow speaking

    Because, again, the interface is abstract and conceptual, so we make it more general, rather than tying an operation to a particular language feature.

Costs of Fundamental Operations

Oftentimes, interfaces don't just stop at listing some operations. They may additionally specify some constraints on the cost of those operations.

  • Dog speaking

    Wait, cost? Are you saying we have to pay money to use a stack??

  • LHS Cow speaking

    Don't worry, that's not what we mean! By "cost", we're referring to the amount of work an operation needs to do.

In the next lesson, we'll introduce a mathematical framework for formally reasoning about cost. For the sake of the current discussion however, we can stick to a high-level approach of categorizing operations into two "tiers" of cost:

  • A constant time operation is one where the amount of work does not depend on the amount of data. To take a real-world example, the amount of work that the CS70 Profs have to do to prepare the weekly lessons doesn't depend on the number of enrolled students. Whether we have 20 students, 50 students, or 100 students, there will always be exactly 16 weeks of lessons with 2 lessons per week.
  • A linear time operation is one where the amount of work scales linearly with the amount of data. Continuing with our real-world example, the amount of work needed to grade the proficiency checks is proportional to the number of students! Each additional student represents one additional check to be graded each week. Compared to a class of 50 students, a class of 100 students would require the Profs to spend (roughly) twice as much time grading proficiency checks.
  • Duck speaking

    Oh, I know this one from CS60! Constant time means \(\mathrm{O}(1)\) and linear time means \(\mathrm{O}(n)\), right?

  • LHS Cow speaking

    That's right! But if you haven't seen this notation before, or forgot it, don't worry! You can stick to the high-level intuition for now. We'll cover that notation (and much more) in the next lesson.

  • Pig speaking

    Yay! MORE!

It's important to note that the "what-how" distinction is still in place! Saying that the stack's SIZE is constant time (for example) does not say anything about what the code will specifically look like. It simply makes a promise that the cost of calling SIZE will not grow with the number of items in the stack, leaving it up to the programmer to figure out exactly how to fulfill that promise.

That being said, such promises often do imply certain trade-offs. To make an analogy, if a product manager at Apple promises that the next iPhone will last one week on a single charge, this promise doesn't directly say anything about the phone's actual design. But in practice, fulfilling that promise will require a larger battery or more power-efficient components, and the engineers will have to rule out designs that lack those features.

What are the trade-offs in promising a constant time SIZE operation?

If we promise a constant time size operation, we need to store the size of the stack somewhere so that we can provide it in constant time. That accounting adds overhead. If we didn't have a size operation or we only promised linear time performance, we could avoid storing the size explicitly and figure it out when needed by counting how many items were on the stack. This second approach is slower, but slightly more memory efficient.

  • Bjarne speaking

    Until 2011, C++'s std::list would not specify whether its size() member function was constant time or linear. The goal was to provide implementors with flexibility, but in practice, it meant that no one could be sure what they were getting. From C++11 onward, std::list has a constant time SIZE operation. Folks who need a very memory-efficient list can use std::forward_list, which is singly linked.

What about the C++-Specific Interface?

Specific languages often have their own quirks, so, in addition to providing the fundamental operations in a language-neutral way, it is also useful to specify what the C++ interface would be (as declared in the (public sections of the) header file).

In the C++-specific interface, we might organize things differently from the conceptual operations. For example, we don't have to follow the exact same names—we might choose to call PUSH by the name push_back (or just push) and make it a member function of a class.

Similarly, whether to return items as values or by reference, and whether the functions take arguments as values or by reference, are all design questions that need to be resolved when we write the C++ interface.

You've already had lots of experience writing header files and the various comments that define this kind of interface.

(When logged in, completion status appears here.)