few exemplars are needed inside stable regions. For example, you
might expect
the required density of exemplars that lie well inside class boundaries to be
much less than the density that is needed near class boundaries. Deciding which
instances to save and which to discard is another key problem in instance-based
learning.
An apparent drawback to instance-based representations is that they do not
make explicit the structures that are learned. In a sense this violates the notion
of “learning” that we presented at the beginning of this book; instances do not
really “describe” the patterns in data. However, the instances combine with the
distance metric to carve out boundaries in instance space that distinguish one
class from another, and this is a kind of explicit representation of knowledge.
For example, given a single instance of each of two classes, the nearest-neighbor rule effectively splits the instance space along the perpendicular bisector of the line joining the instances. Given several instances of each class, the space is divided by a set of lines that represent the perpendicular bisectors of selected lines joining an instance of one class to one of another class. Figure 3.8(a) illustrates a nine-sided polygon that separates the filled-circle class from the open-circle class. This polygon is implicit in the operation of the nearest-neighbor rule.
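The rule just described can be sketched in a few lines of code. This is a minimal illustration, not the book's implementation: it assumes Euclidean distance and two-dimensional points, and the coordinates and class names are made up. With one instance per class, choosing the smaller distance is exactly the same as asking which side of the perpendicular bisector the query point falls on.

```python
# Minimal sketch of the 1-nearest-neighbor rule (illustrative data).
import math

def nearest_neighbor(point, instances):
    """Return the class of the training instance closest to `point`.

    `instances` is a list of ((x, y), label) pairs; ties go to the
    first instance in the list.
    """
    return min(instances, key=lambda inst: math.dist(point, inst[0]))[1]

# One instance of each class: the decision boundary is the
# perpendicular bisector, here the vertical line x = 2.
training = [((0.0, 0.0), "filled"), ((4.0, 0.0), "open")]
print(nearest_neighbor((1.0, 1.0), training))   # filled
print(nearest_neighbor((3.0, -1.0), training))  # open
```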
When training instances are discarded, the result is to save just a few prototypical examples of each class. Figure 3.8(b) shows as dark circles only the examples that actually get used in nearest-neighbor decisions: the others (the light gray ones) can be discarded without affecting the result. These prototypical examples serve as a kind of explicit knowledge representation.
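One concrete way to decide which instances to keep, sketched below, is the condensed nearest-neighbor idea attributed to Hart; the text does not name a particular algorithm, so treat this as one illustrative possibility. An instance is added to the prototype set only if the prototypes chosen so far would misclassify it, so instances deep inside stable regions are discarded.

```python
# A sketch of condensed nearest-neighbor prototype selection
# (illustrative; assumes Euclidean distance and toy data).
import math

def condense(instances):
    """Return a subset of ((x, y), label) pairs that still classifies
    every training instance correctly under the 1-NN rule."""
    prototypes = [instances[0]]
    changed = True
    while changed:
        changed = False
        for coords, label in instances:
            nearest = min(prototypes, key=lambda p: math.dist(coords, p[0]))
            if nearest[1] != label:        # current prototypes get it wrong
                prototypes.append((coords, label))
                changed = True
    return prototypes

data = [((0, 0), "a"), ((0.1, 0), "a"), ((5, 5), "b"), ((5.1, 5), "b")]
print(condense(data))  # → [((0, 0), 'a'), ((5, 5), 'b')]
```

On this toy data one instance per cluster suffices; near a genuine class boundary the method keeps more instances, matching the intuition that exemplar density should be highest near boundaries.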
Some instance-based representations go further and explicitly generalize the
instances. Typically, this is accomplished by creating rectangular regions that
enclose examples of the same class. Figure 3.8(c) shows the rectangular regions
that might be produced. Unknown examples that fall within one of the rectangles will be assigned the corresponding class; ones that fall outside all rectangles will be subject to the usual nearest-neighbor rule. Of course this produces different decision boundaries from the straightforward nearest-neighbor rule, as can be seen by superimposing the polygon in Figure 3.8(a) onto the rectangles. Any part of the polygon that lies within a rectangle will be chopped off and replaced by the rectangle’s boundary.

[Figure 3.8 Different ways of partitioning the instance space: (a) the nearest-neighbor decision boundary, (b) the prototype examples, (c) rectangular generalizations, and (d) nested rectangular regions.]
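The rectangle-plus-fallback scheme just described can be sketched as follows. The rectangle bounds, instances, and class names are illustrative, and axis-parallel rectangles with Euclidean fallback distance are assumptions of this sketch.

```python
# Sketch of classification by rectangular generalizations with a
# nearest-neighbor fallback (illustrative data and bounds).
import math

def classify(point, rectangles, instances):
    """Assign the class of the first rectangle enclosing `point`;
    otherwise fall back on the 1-NN rule over the raw instances."""
    x, y = point
    for (xlo, xhi, ylo, yhi), label in rectangles:
        if xlo <= x <= xhi and ylo <= y <= yhi:
            return label
    return min(instances, key=lambda inst: math.dist(point, inst[0]))[1]

rects = [((0, 2, 0, 2), "filled"), ((5, 7, 5, 7), "open")]
instances = [((1, 1), "filled"), ((6, 6), "open")]
print(classify((1.5, 0.5), rects, instances))  # inside a rectangle: filled
print(classify((4.0, 4.0), rects, instances))  # outside all: falls back to 1-NN
```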
Rectangular generalizations in instance space are just like rules with a special
form of condition, one that tests a numeric variable against an upper and lower
bound and selects the region in between. Different dimensions of the rectangle
correspond to tests on different attributes being ANDed together. Choosing
snugly fitting rectangular regions as tests leads to much more conservative rules
than those generally produced by rule-based machine learning methods,
because for each boundary of the region, there is an actual instance that lies on
(or just inside) that boundary. Tests such as x < a (where x is an attribute value and a is a constant) encompass an entire half-space—they apply no matter how
small x is as long as it is less than a. When doing rectangular generalization in instance space you can afford to be conservative, because if a new example is encountered that lies outside all regions you can fall back on the nearest-neighbor metric. With rule-based methods, an example to which no rules apply either cannot be classified at all or receives just a default classification. The advantage of more conservative rules is that, although incomplete, they may be more perspicuous
than a complete set of rules that covers all cases. Finally, ensuring that the
regions do not overlap is tantamount to ensuring that at most one rule can apply
to an example, eliminating another of the difficulties of rule-based systems—
what to do when several rules apply.
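The correspondence between rectangles and rules can be made explicit in code. The sketch below is illustrative: the attribute names and bounds are invented, and the rule is rendered as a plain string in the IF–THEN style used elsewhere in this chapter. Each dimension of the rectangle becomes an interval test on one attribute, and the tests are ANDed together.

```python
# Sketch: turning a hyperrectangle into a conjunctive rule
# (attribute names and bounds are illustrative).

def rectangle_to_rule(bounds, label):
    """`bounds` maps an attribute name to its (lower, upper) pair."""
    tests = [f"{lo} <= {attr} <= {hi}" for attr, (lo, hi) in bounds.items()]
    return "IF " + " AND ".join(tests) + f" THEN class = {label}"

rule = rectangle_to_rule({"x": (1.2, 3.4), "y": (0.5, 2.0)}, "filled")
print(rule)
# IF 1.2 <= x <= 3.4 AND 0.5 <= y <= 2.0 THEN class = filled
```

Note that each test bounds the attribute on both sides, which is what makes these rules more conservative than open-ended half-space tests like x < a.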
A more complex kind of generalization is to permit rectangular regions to
nest one within another. Then a region that is basically all one class can contain
an inner region of a different class, as illustrated in Figure 3.8(d). It is possible
to allow nesting within nesting so that the inner region can itself contain its own
inner region of a different class—perhaps the original class of the outer region.
This is analogous to allowing rules to have exceptions and exceptions to the
exceptions, as in Section 3.5.
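Nesting can be sketched by checking regions from the innermost outward, so that an inner region overrides the outer one exactly as an exception overrides a rule. The region layout below is illustrative, loosely matching the idea of Figure 3.8(d).

```python
# Sketch of nested rectangular regions: the first (innermost) enclosing
# region wins, implementing rules-with-exceptions (illustrative layout).

def classify_nested(point, regions):
    """`regions` is ordered innermost-first; returns None when the point
    lies outside all regions (where 1-NN fallback would apply)."""
    x, y = point
    for (xlo, xhi, ylo, yhi), label in regions:
        if xlo <= x <= xhi and ylo <= y <= yhi:
            return label
    return None

regions = [
    ((4, 6, 4, 6), "open"),      # inner exception region
    ((0, 10, 0, 10), "filled"),  # outer region
]
print(classify_nested((5, 5), regions))  # open (the exception applies)
print(classify_nested((1, 1), regions))  # filled
```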
It is worth pointing out a slight danger to the technique of visualizing
instance-based learning in terms of boundaries in example space: it makes the
implicit assumption that attributes are numeric rather than nominal. If the
various values that a nominal attribute can take on were laid out along a
line, generalizations involving a segment of that line would make no sense: each
test involves either one value for the attribute or all values for it (or perhaps an
arbitrary subset of values). Although you can more or less easily imagine extending the examples in Figure 3.8 to several dimensions, it is much harder to
imagine how rules involving nominal attributes will look in multidimensional
instance space. Many machine learning situations involve numerous attributes,
and our intuitions tend to lead us astray when extended to high-dimensional
spaces.