The rules we have seen so far are classification rules: they predict the
classification of the example in terms of whether to play or not. It is equally
possible to disregard the classification and just look for any rules that
strongly associate different attribute values. These are called association
rules. Many association rules can be derived from the weather data in Table 1.2.
Some good ones are as follows:
If temperature = cool then humidity = normal
If humidity = normal and windy = false then play = yes
If outlook = sunny and play = no then humidity = high
If windy = false and play = no then outlook = sunny and humidity = high.
All these rules are 100% correct on the given data; they make no false
predictions. The first two apply to four examples in the dataset, the third to three
examples, and the fourth to two examples. There are many other rules: in fact,
nearly 60 association rules can be found that apply to two or more examples of
the weather data and are completely correct on this data. If you look for rules
that are less than 100% correct, then you will find many more. There are so
many because unlike classification rules, association rules can “predict” any of
the attributes, not just a specified class, and can even predict more than one
thing. For example, the fourth rule predicts both that outlook will be sunny and
that humidity will be high.
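The counts quoted above can be checked mechanically. The following is a minimal Python sketch, not a rule learner: it only evaluates the four rules listed earlier. It assumes that the 14 rows below, the widely distributed nominal version of the weather data, match Table 1.2, which is not reproduced in this section. The names WEATHER, RULES, matches, and evaluate are illustrative choices.

```python
# Nominal weather data: (outlook, temperature, humidity, windy, play).
# Assumption: these 14 rows are the standard nominal weather dataset,
# taken here to match Table 1.2.
FIELDS = ("outlook", "temperature", "humidity", "windy", "play")
ROWS = [
    ("sunny", "hot", "high", "false", "no"),
    ("sunny", "hot", "high", "true", "no"),
    ("overcast", "hot", "high", "false", "yes"),
    ("rainy", "mild", "high", "false", "yes"),
    ("rainy", "cool", "normal", "false", "yes"),
    ("rainy", "cool", "normal", "true", "no"),
    ("overcast", "cool", "normal", "true", "yes"),
    ("sunny", "mild", "high", "false", "no"),
    ("sunny", "cool", "normal", "false", "yes"),
    ("rainy", "mild", "normal", "false", "yes"),
    ("sunny", "mild", "normal", "true", "yes"),
    ("overcast", "mild", "high", "true", "yes"),
    ("overcast", "hot", "normal", "false", "yes"),
    ("rainy", "mild", "high", "true", "no"),
]
WEATHER = [dict(zip(FIELDS, row)) for row in ROWS]

# Each rule is (antecedent, consequent); both map attribute -> value.
RULES = [
    ({"temperature": "cool"}, {"humidity": "normal"}),
    ({"humidity": "normal", "windy": "false"}, {"play": "yes"}),
    ({"outlook": "sunny", "play": "no"}, {"humidity": "high"}),
    ({"windy": "false", "play": "no"},
     {"outlook": "sunny", "humidity": "high"}),
]


def matches(example, conditions):
    """True if the example has every attribute value in `conditions`."""
    return all(example[attr] == value for attr, value in conditions.items())


def evaluate(rule, data):
    """Return (number of examples the rule applies to, fraction correct)."""
    antecedent, consequent = rule
    covered = [ex for ex in data if matches(ex, antecedent)]
    correct = [ex for ex in covered if matches(ex, consequent)]
    accuracy = len(correct) / len(covered) if covered else 0.0
    return len(covered), accuracy


RESULTS = [evaluate(rule, WEATHER) for rule in RULES]

for (antecedent, consequent), (applies_to, accuracy) in zip(RULES, RESULTS):
    print(antecedent, "=>", consequent, "| applies to", applies_to,
          "| accuracy", accuracy)
```

Note that the fourth rule has two attributes in its consequent, so it is counted as correct only when both predicted values hold for an example.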
12 CHAPTER 1 | WHAT’S IT ALL ABOUT?
Table 1.3 Weather data with some numeric attributes.

Outlook    Temperature  Humidity  Windy  Play
sunny      85           85        false  no
sunny      80           90        true   no
overcast   83           86        false  yes
rainy      70           96        false  yes
rainy      68           80        false  yes
rainy      65           70        true   no
overcast   64           65        true   yes
sunny      72           95        false  no
sunny      69           70        false  yes
rainy      75           80        false  yes
sunny      75           70        true   yes
overcast   72           90        true   yes
overcast   81           75        false  yes
rainy      71           91        true   no
P088407-Ch001.qxd 4/30/05 11:11 AM Page 12
Contact lenses: An idealized problem
The contact lens data introduced earlier tells you the kind of contact lens to
prescribe, given certain information about a patient. Note that this example is
intended for illustration only: it grossly oversimplifies the problem and should
certainly not be used for diagnostic purposes!
The first column of Table 1.1 gives the age of the patient. In case you’re
wondering, presbyopia is a form of longsightedness that accompanies the onset of
middle age. The second gives the spectacle prescription: myope means
shortsighted and hypermetrope means longsighted. The third shows whether the
patient is astigmatic, and the fourth relates to the rate of tear production, which
is important in this context because tears lubricate contact lenses. The final
column shows which kind of lenses to prescribe: hard, soft, or none. All
possible combinations of the attribute values are represented in the table.
A sample set of rules learned from this information is shown in Figure 1.1.
This is a rather large set of rules, but they do correctly classify all the examples.
These rules are complete and deterministic: they give a unique prescription for
every conceivable example. Generally, this is not the case. Sometimes there are
situations in which no rule applies; other times more than one rule may apply,
resulting in conflicting recommendations. Sometimes probabilities or weights
1.2 SIMPLE EXAMPLES: THE WEATHER PROBLEM AND OTHERS 13
If tear production rate = reduced then recommendation = none
If age = young and astigmatic = no and
   tear production rate = normal then recommendation = soft
If age = pre-presbyopic and astigmatic = no and
   tear production rate = normal then recommendation = soft
If age = presbyopic and spectacle prescription = myope and
   astigmatic = no then recommendation = none
If spectacle prescription = hypermetrope and astigmatic = no and
   tear production rate = normal then recommendation = soft
If spectacle prescription = myope and astigmatic = yes and
   tear production rate = normal then recommendation = hard
If age = young and astigmatic = yes and
   tear production rate = normal then recommendation = hard
If age = pre-presbyopic and
   spectacle prescription = hypermetrope and astigmatic = yes
   then recommendation = none
If age = presbyopic and spectacle prescription = hypermetrope
   and astigmatic = yes then recommendation = none

Figure 1.1 Rules for the contact lens data.
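One way to see what "complete and deterministic" means for the rules in Figure 1.1 is to apply them to every possible patient. The Python sketch below transcribes the nine rules and enumerates all combinations of the four attributes. The attribute values are those mentioned in the text and the figure; the names ATTRIBUTES, RULES, and recommendations are illustrative assumptions. Treating the rules as an unordered set, in which every matching rule fires, is also an assumption made for the purpose of the check.

```python
from itertools import product

# Attribute values as they appear in the text and in Figure 1.1.
ATTRIBUTES = {
    "age": ["young", "pre-presbyopic", "presbyopic"],
    "spectacle prescription": ["myope", "hypermetrope"],
    "astigmatic": ["no", "yes"],
    "tear production rate": ["reduced", "normal"],
}

# The nine rules of Figure 1.1 as (conditions, recommendation) pairs.
RULES = [
    ({"tear production rate": "reduced"}, "none"),
    ({"age": "young", "astigmatic": "no",
      "tear production rate": "normal"}, "soft"),
    ({"age": "pre-presbyopic", "astigmatic": "no",
      "tear production rate": "normal"}, "soft"),
    ({"age": "presbyopic", "spectacle prescription": "myope",
      "astigmatic": "no"}, "none"),
    ({"spectacle prescription": "hypermetrope", "astigmatic": "no",
      "tear production rate": "normal"}, "soft"),
    ({"spectacle prescription": "myope", "astigmatic": "yes",
      "tear production rate": "normal"}, "hard"),
    ({"age": "young", "astigmatic": "yes",
      "tear production rate": "normal"}, "hard"),
    ({"age": "pre-presbyopic", "spectacle prescription": "hypermetrope",
      "astigmatic": "yes"}, "none"),
    ({"age": "presbyopic", "spectacle prescription": "hypermetrope",
      "astigmatic": "yes"}, "none"),
]


def recommendations(patient):
    """Set of recommendations made by all rules whose conditions hold."""
    return {
        rec
        for conditions, rec in RULES
        if all(patient[attr] == value for attr, value in conditions.items())
    }


# Every conceivable patient: one per combination of attribute values.
PATIENTS = [dict(zip(ATTRIBUTES, values))
            for values in product(*ATTRIBUTES.values())]
OUTCOMES = [recommendations(p) for p in PATIENTS]

# Complete: at least one rule applies to each patient.
COMPLETE = all(len(outcome) >= 1 for outcome in OUTCOMES)
# Deterministic: the rules that apply never disagree.
DETERMINISTIC = all(len(outcome) <= 1 for outcome in OUTCOMES)

print(len(PATIENTS), "patients; complete:", COMPLETE,
      "; deterministic:", DETERMINISTIC)
```

Several rules can fire for the same patient (for example, the first and fourth rules both apply to a presbyopic myope with reduced tear production), which is why the check compares the set of recommendations rather than counting matching rules.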