36
In practice, programmers use
equal?
to compare lists that contain numbers as well as symbols.
Numbers are not considered to be symbols. The question of whether two numerically equal numbers
(as tested by
=
) are also
eq?
is highly implementation-dependent. A better definition of
equal?
(such as the one that comes as a primitive in Scheme) would also stipulate that if
a
and
b
are both
numbers, then
a
and
b
are
equal?
if they are numerically equal.
37
If we want to be more formal, we can specify ‘‘consistent with the interpretations given above’’ to
mean that the operations satisfy a collection of rules such as these:
For any set
S
and any object
x
,
(element-of-set? x (adjoin-set x S))
is true
(informally: ‘‘Adjoining an object to a set produces a set that contains the object’’).
For any sets
S
and
T
and any object
x
,
(element-of-set? x (union-set S T))
is
equal to
(or (element-of-set? x S) (element-of-set? x T))
(informally:
‘‘The elements of
(union S T)
are the elements that are in
S
or in
T
’’).
For any object
x
,
(element-of-set? x ’())
is false (informally: ‘‘No object is an
element of the empty set’’).
38
Halving the size of the problem at each step is the distinguishing characteristic of logarithmic
growth, as we saw with the fast-exponentiation algorithm of section 1.2.4 and the half-interval search
method of section 1.3.3.
39
We are representing sets in terms of trees, and trees in terms of lists -- in effect, a data abstraction
built upon a data abstraction. We can regard the procedures
entry
,
left-branch
,
right-branch
, and
make-tree
as a way of isolating the abstraction of a ‘‘binary tree’’ from the
particular way we might wish to represent such a tree in terms of list structure.
40
Examples of such structures include B-trees and red-black trees. There is a large literature on data
structures devoted to this problem. See Cormen, Leiserson, and Rivest 1990.
41
Exercises 2.63-2.65 are due to Paul Hilfinger.
42
See Hamming 1980 for a discussion of the mathematical properties of Huffman codes.
[Go to first, previous, next page; contents; index]
[Go to first, previous, next page; contents; index]
2.4 Multiple Representations for Abstract Data
We have introduced data abstraction, a methodology for structuring systems in such a way that much
of a program can be specified independent of the choices involved in implementing the data objects
that the program manipulates. For example, we saw in section 2.1.1 how to separate the task of
designing a program that uses rational numbers from the task of implementing rational numbers in
terms of the computer language’s primitive mechanisms for constructing compound data. The key idea
was to erect an abstraction barrier -- in this case, the selectors and constructors for rational numbers
(
make-rat
,
numer
,
denom
) -- that isolates the way rational numbers are used from their underlying
representation in terms of list structure. A similar abstraction barrier isolates the details of the
procedures that perform rational arithmetic (
add-rat
,
sub-rat
,
mul-rat
, and
div-rat
) from
the ‘‘higher-level’’ procedures that use rational numbers. The resulting program has the structure
shown in figure 2.1.
These data-abstraction barriers are powerful tools for controlling complexity. By isolating the
underlying representations of data objects, we can divide the task of designing a large program into
smaller tasks that can be performed separately. But this kind of data abstraction is not yet powerful
enough, because it may not always make sense to speak of ‘‘the underlying representation’’ for a data
object.
For one thing, there might be more than one useful representation for a data object, and we might like
to design systems that can deal with multiple representations. To take a simple example, complex
numbers may be represented in two almost equivalent ways: in rectangular form (real and imaginary
parts) and in polar form (magnitude and angle). Sometimes rectangular form is more appropriate and
sometimes polar form is more appropriate. Indeed, it is perfectly plausible to imagine a system in
which complex numbers are represented in both ways, and in which the procedures for manipulating
complex numbers work with either representation.
More importantly, programming systems are often designed by many people working over extended
periods of time, subject to requirements that change over time. In such an environment, it is simply not
possible for everyone to agree in advance on choices of data representation. So in addition to the
data-abstraction barriers that isolate representation from use, we need abstraction barriers that isolate
different design choices from each other and permit different choices to coexist in a single program.
Furthermore, since large programs are often created by combining pre-existing modules that were
designed in isolation, we need conventions that permit programmers to incorporate modules into larger
systems additively, that is, without having to redesign or reimplement these modules.
In this section, we will learn how to cope with data that may be represented in different ways by
different parts of a program. This requires constructing generic procedures -- procedures that can
operate on data that may be represented in more than one way. Our main technique for building
generic procedures will be to work in terms of data objects that have type tags, that is, data objects that
include explicit information about how they are to be processed. We will also discuss data-directed
programming, a powerful and convenient implementation strategy for additively assembling systems
with generic operations.
We begin with the simple complex-number example. We will see how type tags and data-directed
style enable us to design separate rectangular and polar representations for complex numbers while
maintaining the notion of an abstract ‘‘complex-number’’ data object. We will accomplish this by
defining arithmetic procedures for complex numbers (
add-complex
,
sub-complex
,