Data Mining: Practical Machine Learning Tools and Techniques, Second Edition

Yüklə 4,3 Mb.

Pdf görüntüsü

səhifə	39/219
tarix	08.10.2017
ölçüsü	4,3 Mb.
	#3816

1 ... 35 36 37 38 39 40 41 42 ... 219

Then it is necessary to break the symmetry and choose a single test for the root

node. If, for example, a is chosen, the second rule must, in effect, be repeated

twice in the tree, as shown in Figure 3.2. This is known as the replicated subtree

problem.

The replicated subtree problem is sufﬁciently important that it is worth

looking at a couple more examples. The diagram on the left of Figure 3.3 shows

an exclusive-or function for which the output is a if x

= 1 or y = 1 but not both.

To make this into a tree, you have to split on one attribute ﬁrst, leading to a

structure like the one shown in the center. In contrast, rules can faithfully reﬂect

the true symmetry of the problem with respect to the attributes, as shown on

the right.

6 6

C H A P T E R 3

O U T P U T: K N OW L E D G E R E P R E S E N TAT I O N

Figure 3.2 Decision tree for a simple disjunction.

P088407-Ch003.qxd 4/30/05 11:09 AM Page 66

In this example the rules are not notably more compact than the tree. In fact,

they are just what you would get by reading rules off the tree in the obvious

way. But in other situations, rules are much more compact than trees, particu-

larly if it is possible to have a “default” rule that covers cases not speciﬁed by the

other rules. For example, to capture the effect of the rules in Figure 3.4—in

which there are four attributes, x, y, z, and w, that can each be 1, 2, or 3—requires

the tree shown on the right. Each of the three small gray triangles to the upper

right should actually contain the whole three-level subtree that is displayed in

gray, a rather extreme example of the replicated subtree problem. This is a dis-

tressingly complex description of a rather simple concept.

One reason why rules are popular is that each rule seems to represent an inde-

pendent “nugget” of knowledge. New rules can be added to an existing rule set

without disturbing ones already there, whereas to add to a tree structure may

require reshaping the whole tree. However, this independence is something of

an illusion, because it ignores the question of how the rule set is executed. We

explained earlier (on page 11) the fact that if rules are meant to be interpreted

in order as a “decision list,” some of them, taken individually and out of context,

may be incorrect. On the other hand, if the order of interpretation is supposed

to be immaterial, then it is not clear what to do when different rules lead to dif-

ferent conclusions for the same instance. This situation cannot arise for rules

that are read directly off a decision tree because the redundancy included in the

structure of the rules prevents any ambiguity in interpretation. But it does arise

when rules are generated in other ways.

If a rule set gives multiple classiﬁcations for a particular example, one solu-

tion is to give no conclusion at all. Another is to count how often each rule ﬁres

on the training data and go with the most popular one. These strategies can lead

3 . 3

C L A S S I F I C AT I O N RU L E S

6 7

x = 1 ?

y = 1 ?

yes

If x=1 and y=0 then class = a

If x=0 and y=1 then class = a

If x=0 and y=0 then class = b

If x=1 and y=1 then class = b

Figure 3.3 The exclusive-or problem.

P088407-Ch003.qxd 4/30/05 11:09 AM Page 67

to radically different results. A different problem occurs when an instance is

encountered that the rules fail to classify at all. Again, this cannot occur with

decision trees, or with rules read directly off them, but it can easily happen with

general rule sets. One way of dealing with this situation is to fail to classify such

an example; another is to choose the most frequently occurring class as a default.

Again, radically different results may be obtained for these strategies. Individ-

ual rules are simple, and sets of rules seem deceptively simple—but given just

a set of rules with no additional information, it is not clear how it should be

interpreted.

A particularly straightforward situation occurs when rules lead to a class that

is Boolean (say, yes and no) and when only rules leading to one outcome (say,

yes) are expressed. The assumption is that if a particular instance is not in class

6 8

C H A P T E R 3

O U T P U T: K N OW L E D G E R E P R E S E N TAT I O N

If x=1 and y=1 then class = a

If z=1 and w=1 then class = a

Otherwise class = b

Figure 3.4 Decision tree with a replicated subtree.

P088407-Ch003.qxd 4/30/05 11:09 AM Page 68

Yüklə 4,3 Mb.

Dostları ilə paylaş:

1 ... 35 36 37 38 39 40 41 42 ... 219