Data Mining: Practical Machine Learning Tools and Techniques, Second Edition

yes, then it must be in class no, which is a form of closed world assumption. If this is the case, then rules cannot conflict and there is no ambiguity in rule interpretation: any interpretation strategy will give the same result. Such a set of rules can be written as a logic expression in what is called disjunctive normal form: that is, as a disjunction (OR) of conjunctive (AND) conditions.
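A minimal sketch of this special case follows, assuming a two-class problem in which rules are given only for class yes. The two rules and their attribute names are hypothetical, chosen purely for illustration. The rule set is evaluated as an OR of ANDs, and anything no rule covers falls into class no.

```python
# Each rule is a conjunction (AND) of attribute = value tests.
# The rule set as a whole is a disjunction (OR) of those conjunctions.
# Hypothetical rules for class "yes"; not taken from the text.
RULES_FOR_YES = [
    {"outlook": "overcast"},
    {"humidity": "normal", "windy": "false"},
]


def rule_fires(rule, instance):
    """A rule fires when every one of its conditions holds."""
    return all(instance.get(attr) == value for attr, value in rule.items())


def classify(instance, rules=RULES_FOR_YES):
    """Closed world assumption: anything not covered by a rule is 'no'."""
    return "yes" if any(rule_fires(r, instance) for r in rules) else "no"


def classify_reordered(instance):
    """Same rules in reverse order; the result cannot differ."""
    return classify(instance, list(reversed(RULES_FOR_YES)))
```

Because `any` is a pure disjunction, the order in which the rules are listed has no effect on the outcome, which is the modularity property described above.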

It is this simple special case that seduces people into assuming rules are very easy to deal with, because here each rule really does operate as a new, independent piece of information that contributes in a straightforward way to the disjunction. Unfortunately, it only applies to Boolean outcomes and requires the closed world assumption, and both these constraints are unrealistic in most practical situations. Machine learning algorithms that generate rules invariably produce ordered rule sets in multiclass situations, and this sacrifices any possibility of modularity because the order of execution is critical.
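To see why order matters, consider this small sketch of an ordered rule set in which the first rule that fires determines the class. The rules, attribute names, and class labels are hypothetical and serve only to illustrate the point.

```python
# An ordered rule set (decision list): the first rule that fires wins.
# Rules, attribute names, and class labels here are hypothetical.
RULES = [
    ({"size": "small"}, "A"),
    ({"colour": "red"}, "B"),
]
DEFAULT_CLASS = "C"


def classify(instance, rules):
    """Return the class of the first rule whose conditions all hold."""
    for conditions, label in rules:
        if all(instance.get(a) == v for a, v in conditions.items()):
            return label
    return DEFAULT_CLASS


instance = {"size": "small", "colour": "red"}  # both rules fire
original_order = classify(instance, RULES)
swapped_order = classify(instance, list(reversed(RULES)))
```

Here `original_order` is class A and `swapped_order` is class B: the same two rules give different answers depending on their order, so neither can be read as an independent piece of knowledge.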

3.4 Association rules

Association rules are really no different from classification rules except that they can predict any attribute, not just the class, and this gives them the freedom to predict combinations of attributes too. Also, association rules are not intended to be used together as a set, as classification rules are. Different association rules express different regularities that underlie the dataset, and they generally predict different things.

Because so many different association rules can be derived from even a tiny dataset, interest is restricted to those that apply to a reasonably large number of instances and have a reasonably high accuracy on the instances to which they apply. The coverage of an association rule is the number of instances for which it predicts correctly; this is often called its support. Its accuracy, often called confidence, is the number of instances that it predicts correctly, expressed as a proportion of all instances to which it applies. For example, with the rule:

If temperature = cool then humidity = normal

the coverage is the number of days that are both cool and have normal humidity (4 days in the data of Table 1.2), and the accuracy is the proportion of cool days that have normal humidity (100% in this case). It is usual to specify minimum coverage and accuracy values and to seek only those rules whose coverage and accuracy are both at least these specified minima. In the weather data, for example, there are 58 rules whose coverage and accuracy are at least 2 and 95%, respectively. (It may also be convenient to specify coverage as a percentage of the total number of instances instead.)
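The two measures can be computed directly, as in the sketch below. It assumes the 14-instance nominal weather data that the text cites as Table 1.2; the table is not reproduced in this section, so the rows here are the commonly distributed version of that dataset and should be checked against the table. The function and variable names are illustrative.

```python
# The 14-instance nominal weather data, reproduced here as an assumption
# (this section only cites it as Table 1.2).
COLUMNS = ("outlook", "temperature", "humidity", "windy", "play")
ROWS = [
    ("sunny", "hot", "high", "false", "no"),
    ("sunny", "hot", "high", "true", "no"),
    ("overcast", "hot", "high", "false", "yes"),
    ("rainy", "mild", "high", "false", "yes"),
    ("rainy", "cool", "normal", "false", "yes"),
    ("rainy", "cool", "normal", "true", "no"),
    ("overcast", "cool", "normal", "true", "yes"),
    ("sunny", "mild", "high", "false", "no"),
    ("sunny", "cool", "normal", "false", "yes"),
    ("rainy", "mild", "normal", "false", "yes"),
    ("sunny", "mild", "normal", "true", "yes"),
    ("overcast", "mild", "high", "true", "yes"),
    ("overcast", "hot", "normal", "false", "yes"),
    ("rainy", "mild", "high", "true", "no"),
]
WEATHER = [dict(zip(COLUMNS, row)) for row in ROWS]


def matches(instance, conditions):
    """True when the instance satisfies every attribute = value test."""
    return all(instance[a] == v for a, v in conditions.items())


def coverage_and_accuracy(antecedent, consequent, data=WEATHER):
    """Coverage = instances predicted correctly; accuracy = that count
    as a proportion of the instances the antecedent applies to."""
    applies = [x for x in data if matches(x, antecedent)]
    correct = [x for x in applies if matches(x, consequent)]
    accuracy = len(correct) / len(applies) if applies else 0.0
    return len(correct), accuracy


coverage, accuracy = coverage_and_accuracy(
    {"temperature": "cool"}, {"humidity": "normal"}
)
print(coverage, accuracy)
```

With these rows the script prints `4 1.0`, agreeing with the coverage of 4 and accuracy of 100% quoted for this rule.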

Association rules that predict multiple consequences must be interpreted rather carefully. For example, with the weather data in Table 1.2 we saw this rule:

If windy = false and play = no then outlook = sunny and humidity = high

This is not just a shorthand expression for the two separate rules:

If windy = false and play = no then outlook = sunny

If windy = false and play = no then humidity = high

It does imply that these two rules meet the minimum coverage and accuracy figures, but it also implies more. The original rule means that the number of examples that are nonwindy, nonplaying, with sunny outlook and high humidity, is at least as great as the specified minimum coverage figure. It also means that the number of such days, expressed as a proportion of nonwindy, nonplaying days, is at least the specified minimum accuracy figure. This implies that the rule

If humidity = high and windy = false and play = no then outlook = sunny

also holds, because it has the same coverage as the original rule, and its accuracy must be at least as high as the original rule’s because the number of high-humidity, nonwindy, nonplaying days can be no greater than that of nonwindy, nonplaying days, which can only raise the accuracy or leave it unchanged.
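The argument can be written out as a short inequality. The notation is introduced here for illustration only: n(·) is the number of instances satisfying a condition, and W, P, S, and H stand for windy = false, play = no, outlook = sunny, and humidity = high, respectively.

```latex
\begin{align*}
\text{coverage}(W \wedge P \Rightarrow S \wedge H)
  &= n(W \wedge P \wedge S \wedge H)
   = \text{coverage}(H \wedge W \wedge P \Rightarrow S), \\
\text{accuracy}(H \wedge W \wedge P \Rightarrow S)
  &= \frac{n(W \wedge P \wedge S \wedge H)}{n(H \wedge W \wedge P)}
   \;\ge\; \frac{n(W \wedge P \wedge S \wedge H)}{n(W \wedge P)}
   = \text{accuracy}(W \wedge P \Rightarrow S \wedge H).
\end{align*}
```

The inequality holds because every instance counted in n(H ∧ W ∧ P) is also counted in n(W ∧ P), so the denominator on the left can be no larger than the one on the right.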

As we have seen, there are relationships between particular association rules: some rules imply others. To reduce the number of rules that are produced, in cases where several rules are related it makes sense to present only the strongest one to the user. In the preceding example, only the first rule should be printed.

3.5 Rules with exceptions

Returning to classification rules, a natural extension is to allow them to have exceptions. Then incremental modifications can be made to a rule set by expressing exceptions to existing rules rather than reengineering the entire set. For example, consider the iris problem described earlier. Suppose a new flower was found with the dimensions given in Table 3.1, and an expert declared it to be an instance of Iris setosa. If this flower was classified by the rules given in Chapter 1 (pages 15–16) for this problem, it would be misclassified by two of them:

Table 3.1  A new iris flower.

Sepal length (cm)   Sepal width (cm)   Petal length (cm)   Petal width (cm)   Type
5.1                 3.5                2.6                 0.2
