5
Surprisingly, this idea is very difficult to formulate rigorously. There are two approaches to giving
such a formulation. One, pioneered by C. A. R. Hoare (1972), is known as the method of abstract
models. It formalizes the ‘‘procedures plus conditions’’ specification as outlined in the
rational-number example above. Note that the condition on the rational-number representation was
stated in terms of facts about integers (equality and division). In general, abstract models define new
kinds of data objects in terms of previously defined types of data objects. Assertions about data objects
can therefore be checked by reducing them to assertions about previously defined data objects.
Another approach, introduced by Zilles at MIT, by Goguen, Thatcher, Wagner, and Wright at IBM
(see Thatcher, Wagner, and Wright 1978), and by Guttag at Toronto (see Guttag 1977), is called
algebraic specification. It regards the ‘‘procedures’’ as elements of an abstract algebraic system whose
behavior is specified by axioms that correspond to our ‘‘conditions,’’ and uses the techniques of
abstract algebra to check assertions about data objects. Both methods are surveyed in the paper by
Liskov and Zilles (1975).
[Go to first, previous, next page; contents; index]
[Go to first, previous, next page; contents; index]
2.2 Hierarchical Data and the Closure Property
As we have seen, pairs provide a primitive ‘‘glue’’ that we can use to construct compound data
objects. Figure 2.2 shows a standard way to visualize a pair -- in this case, the pair formed by
(cons
1 2)
. In this representation, which is called box-and-pointer notation, each object is shown as a
pointer to a box. The box for a primitive object contains a representation of the object. For example,
the box for a number contains a numeral. The box for a pair is actually a double box, the left part
containing (a pointer to) the
car
of the pair and the right part containing the
cdr
.
We have already seen that
cons
can be used to combine not only numbers but pairs as well. (You
made use of this fact, or should have, in doing exercises 2.2 and 2.3.) As a consequence, pairs provide
a universal building block from which we can construct all sorts of data structures. Figure 2.3 shows
two ways to use pairs to combine the numbers 1, 2, 3, and 4.
Figure 2.2: Box-and-pointer representation of
(cons 1 2)
.
Figure 2.2: Box-and-pointer representation of
(cons 1 2)
.
Figure 2.3: Two ways to combine 1, 2, 3, and 4 using pairs.
Figure 2.3: Two ways to combine 1, 2, 3, and 4 using pairs.
The ability to create pairs whose elements are pairs is the essence of list structure’s importance as a
representational tool. We refer to this ability as the closure property of
cons
. In general, an operation
for combining data objects satisfies the closure property if the results of combining things with that
operation can themselves be combined using the same operation.
6
Closure is the key to power in any
means of combination because
it permits us to create hierarchical structures -- structures made up of
parts, which themselves are made up of parts, and so on.
From the outset of chapter 1, we’ve made essential use of closure in dealing with procedures, because
all but the very simplest programs rely on the fact that the elements of a combination can themselves
be combinations. In this section, we take up the consequences of closure for compound data. We
describe some conventional techniques for using pairs to represent sequences and trees, and we exhibit
a graphics language that illustrates closure in a vivid way.
7
2.2.1 Representing Sequences
Figure 2.4: The sequence 1, 2, 3, 4 represented as a chain of pairs.
Figure 2.4: The sequence 1, 2, 3, 4 represented as a chain of pairs.
One of the useful structures we can build with pairs is a sequence -- an ordered collection of data
objects. There are, of course, many ways to represent sequences in terms of pairs. One particularly
straightforward representation is illustrated in figure 2.4, where the sequence 1, 2, 3, 4 is represented
as a chain of pairs. The
car
of each pair is the corresponding
item in the chain, and the
cdr
of the
pair is the next pair in the chain. The
cdr
of the final pair signals the end of the sequence by pointing
to a distinguished value that is not a pair, represented in box-and-pointer diagrams as a diagonal line
and in programs as the value of the variable
nil
. The entire sequence is constructed by nested
cons
operations:
(cons 1
(cons 2
(cons 3
(cons 4 nil))))
Such a sequence of pairs, formed by nested
cons
es, is called a
list, and Scheme provides a primitive
called
list
to help in constructing lists.
8
The above sequence could be produced by
(list 1 2 3
4)
. In general,
(list <
a
1
> <a
2
> ... <a
n
>)
is equivalent to
(cons <
a
1
> (cons <a
2
> (cons ... (cons <a
n
> nil) ...)))
Lisp systems conventionally print lists by printing the sequence of elements, enclosed in parentheses.
Thus, the data object in figure 2.4 is printed as
(1 2 3 4)
:
(define one-through-four (list 1 2 3 4))
one-through-four
(1 2 3 4)