Logically, the exception-based rules can very simply be
rewritten in terms of regular
if . . . then . . . else clauses. What is gained by the formulation in terms of excep-
tions is not logical but psychological. We assume that the defaults and the tests
that occur early apply more widely than the exceptions further down. If this is
indeed true for the domain, and the user can see that it is plausible, the expres-
sion in terms of (common) rules and (rare) exceptions will be easier to grasp
than a different, but logically equivalent, structure.
3.6 Rules involving relations
We have assumed implicitly that the conditions in rules involve testing an
attribute value against a constant. Such rules are called propositional because the
attribute-value language used to define them has the same power as what logi-
cians call the propositional calculus. In many classification tasks, propositional
rules are sufficiently expressive for concise, accurate concept descriptions. The
weather, contact lens recommendation, iris type, and acceptability of labor con-
tract datasets mentioned previously, for example, are well described by propo-
sitional rules. However, there are situations in which a more expressive form of
rule would provide a more intuitive and concise concept description, and these
are situations that involve relationships between examples such as those encoun-
tered in Section 2.2.
Suppose, to take a concrete example, we have the set of eight building blocks
of the various shapes and sizes illustrated in Figure 3.6, and we wish to learn
the concept of standing. This is a classic two-class problem with classes stand-
ing and lying. The four shaded blocks are positive (standing) examples of the
concept, and the unshaded blocks are negative (lying) examples. The only
information the learning algorithm will be given is the width, height, and
number of sides of each block. The training data is shown in Table 3.2.

Figure 3.6 The shapes problem. (Shaded blocks: standing; unshaded blocks: lying.)
A propositional rule set that might be produced for this data is:
if width ≥ 3.5 and height < 7.0 then lying
if height ≥ 3.5 then standing
In case you’re wondering, 3.5 is chosen as the breakpoint for width because it is
halfway between the width of the thinnest lying block, namely 4, and the width
of the fattest standing block whose height is less than 7, namely 3. Also, 7.0 is
chosen as the breakpoint for height because it is halfway between the height of
the tallest lying block, namely 6, and the shortest standing block whose width
is greater than 3.5, namely 8. It is common to place numeric thresholds halfway
between the values that delimit the boundaries of a concept.
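To make the breakpoint calculation concrete, here is a short sketch in Python (not the book's code; the variable names and representation are our own) that derives both thresholds from the Table 3.2 blocks and applies the two propositional rules in order:

```python
# A sketch of how the halfway breakpoints are derived from Table 3.2;
# blocks are hard-coded as (width, height, class) tuples.
blocks = [
    (2, 4, "standing"), (3, 6, "standing"), (4, 3, "lying"),
    (7, 8, "standing"), (7, 6, "lying"), (2, 9, "standing"),
    (9, 1, "lying"), (10, 2, "lying"),
]

# Width breakpoint: halfway between the thinnest lying block (4) and the
# fattest standing block whose height is less than 7 (namely 3).
thinnest_lying = min(w for w, h, c in blocks if c == "lying")
fattest_short_standing = max(w for w, h, c in blocks
                             if c == "standing" and h < 7)
width_break = (thinnest_lying + fattest_short_standing) / 2  # 3.5

# Height breakpoint: halfway between the tallest lying block (6) and the
# shortest standing block whose width exceeds 3.5 (namely 8).
tallest_lying = max(h for w, h, c in blocks if c == "lying")
shortest_wide_standing = min(h for w, h, c in blocks
                             if c == "standing" and w > width_break)
height_break = (tallest_lying + shortest_wide_standing) / 2  # 7.0

def classify(width, height):
    """Apply the two propositional rules in order."""
    if width >= width_break and height < height_break:
        return "lying"
    if height >= 3.5:  # the second rule's threshold
        return "standing"
    return None  # neither rule fires, e.g. for a 1-by-2 block
```

Running this reproduces both thresholds and classifies all eight training blocks correctly, while leaving a width-1, height-2 block uncovered, as the text observes.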
Although these two rules work well on the examples given, they are not very
good. Many new blocks would not be classified by either rule (e.g., one with
width 1 and height 2), and it is easy to devise many legitimate blocks that the
rules would not fit.
A person classifying the eight blocks would probably notice that
“standing blocks are those that are taller than they are wide.” This rule does
not compare attribute values with constants; it compares attributes with each
other:
if width > height then lying
if height > width then standing
The actual values of the height and width attributes are not important; only
the result of comparing the two matters. Rules of this form are called relational, because
they express relationships between attributes, rather than propositional, which
denotes a fact about just one attribute.
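As a quick check, the relational rules can be sketched in code and run against the training data (a sketch under our own representation of the blocks, not the book's code):

```python
# The relational rules compare the two attributes with each other
# rather than against constants.
def classify(width, height):
    if width > height:
        return "lying"
    if height > width:
        return "standing"
    return None  # the rules say nothing about square blocks

# Training data from Table 3.2 as (width, height, class) tuples.
blocks = [(2, 4, "standing"), (3, 6, "standing"), (4, 3, "lying"),
          (7, 8, "standing"), (7, 6, "lying"), (2, 9, "standing"),
          (9, 1, "lying"), (10, 2, "lying")]
assert all(classify(w, h) == c for w, h, c in blocks)
# Unlike the propositional rules, this also covers a 1-by-2 block:
assert classify(1, 2) == "standing"
```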
Table 3.2 Training data for the shapes problem.

Width   Height   Sides   Class
2       4        4       standing
3       6        4       standing
4       3        4       lying
7       8        3       standing
7       6        3       lying
2       9        4       standing
9       1        4       lying
10      2        3       lying
Standard relations include equality (and inequality) for nominal attributes
and less than and greater than for numeric ones. Although relational nodes
could be put into decision trees just as relational conditions can be put into
rules, schemes that accommodate relations generally use the rule rather than the
tree representation. However, most machine learning methods do not consider
relational rules because there is a considerable cost in doing so. One way of
allowing a propositional method to make use of relations is to add extra, sec-
ondary attributes that say whether two primary attributes are equal or not, or
give the difference between them if they are numeric. For example, we might
add a binary attribute is width < height? to Table 3.2. Such attributes are often
added as part of the data engineering process.
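A sketch of that kind of attribute engineering in Python (the column layout and names are our own):

```python
# Each row is (width, height, sides); we append two derived, secondary
# attributes: the boolean width < height and the numeric difference.
rows = [(2, 4, 4), (3, 6, 4), (4, 3, 4), (7, 8, 3),
        (7, 6, 3), (2, 9, 4), (9, 1, 4), (10, 2, 3)]
extended = [(w, h, s, w < h, h - w) for w, h, s in rows]
# A propositional learner can now test the new boolean attribute
# directly, e.g. "if (width < height) = true then standing".
```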
With a seemingly rather small further enhancement, the expressive
power of the relational knowledge representation can be extended very
greatly. The trick is to express rules in a way that makes the role of the instance
explicit:
if width(block) > height(block) then lying(block)
if height(block) > width(block) then standing(block)
Although this does not seem like much of an extension, it is if instances can
be decomposed into parts. For example, if a tower is a pile of blocks, one on top
of the other, then the fact that the topmost block of the tower is standing can
be expressed by:
if height(tower.top) > width(tower.top) then standing(tower.top)
Here, tower.top is used to refer to the topmost block. So far, nothing has been
gained. But if tower.rest refers to the rest of the tower, then the fact that the tower
is composed entirely of standing blocks can be expressed by the rules:
if height(tower.top) > width(tower.top) and standing(tower.rest)
then standing(tower)
The apparently minor addition of the condition standing(tower.rest) is a recur-
sive expression that will turn out to be true only if the rest of the tower is com-
posed of standing blocks. A recursive application of the same rule will test this.
Of course, it is necessary to ensure that the recursion “bottoms out” properly
by adding a further rule, such as:
if tower = empty then standing(tower.top)
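The recursive rules can be mimicked in ordinary code. Here is a sketch in Python (the representation is our own, not the book's): a tower is a list of (width, height) blocks with the topmost first, so that tower[0] plays the role of tower.top and tower[1:] of tower.rest:

```python
def standing_tower(tower):
    """True if every block in the tower is standing."""
    if not tower:              # the recursion "bottoms out" here
        return True
    (width, height), rest = tower[0], tower[1:]
    # The recursive rule: the top block is standing and, recursively,
    # so is the rest of the tower.
    return height > width and standing_tower(rest)

assert standing_tower([(2, 4), (3, 6)])      # all blocks taller than wide
assert not standing_tower([(2, 4), (9, 1)])  # one lying block spoils it
```

The recursion over tower.rest is what lets the rule handle towers of any height, which is exactly the extra expressive power the text attributes to relational rules with structured instances.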
With this addition, relational rules can express concepts that cannot possibly be
expressed propositionally, because the recursion can take place over arbitrarily
long lists of objects. Sets of rules such as this are called logic programs, and this
area of machine learning is called inductive logic programming. We will not be
treating it further in this book.