Chapter 5: Association Rules
Figure 5-3. Selection of attributes to include in the association rules model.
5) One more data preparation step is needed: changing the data type of our selected
attributes from integer to binominal. As previously mentioned, the association rules
operators need this data type in order to function properly. In the search box on the
Operators tab in design view, type ‘Numerical to’ (without the single quotes) to locate the
operators that change attributes with a numeric data type to some other data type. The
one we will use is Numerical to Binominal. Drag this operator into your stream.
Data Mining for the Masses
Figure 5-4. Adding a data type conversion operator to a data mining model.
6) For our purposes, all attributes that remain after applying the Select Attributes
operator need to be converted from numeric to binominal, so, as the black arrow indicates
in Figure 5-4, we will convert ‘all’ from the former data type to the latter. We could
convert a subset or a single attribute by selecting one of those options in the attribute filter
type dropdown menu. We have done this in the past, but in this example we can accept
the default and convert all attributes at once. You should also observe that within
RapidMiner the data type binominal is used instead of binomial, a term many data
analysts are more used to. There is an important distinction. Binomial means one of two
numbers (usually 0 and 1), so the basic underlying data type is still numeric. Binominal,
on the other hand, means one of two values, which may be numeric or character based.
Click the play button to run your model and see how this conversion has taken place in
our data set. In results perspective, you should see the transformation, as depicted in
Figure 5-5.
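
For readers who want to see the idea outside the RapidMiner interface, the conversion and the binomial/binominal distinction can be sketched in a few lines of Python. This is only an illustration, not RapidMiner's implementation: the function names, the attribute names, and the sample rows are hypothetical, and the sketch assumes that 0 maps to false and any nonzero number maps to true.

```python
def numerical_to_binominal(rows):
    """Convert every numeric value in each row to a two-valued (True/False) flag.

    Assumption: 0 means "absent" and any nonzero number means "present".
    """
    return [{name: value != 0 for name, value in row.items()} for row in rows]


def is_binominal(values):
    """An attribute is binominal if it takes at most two distinct values,
    whether those values are numbers or strings."""
    return len(set(values)) <= 2


# Hypothetical survey rows: 1 = belongs to that kind of organization, 0 = does not.
rows = [
    {"Family": 1, "Hobby": 0, "Religious": 1},
    {"Family": 0, "Hobby": 0, "Religious": 0},
    {"Family": 1, "Hobby": 1, "Religious": 1},
]

converted = numerical_to_binominal(rows)
print(converted[0])                         # {'Family': True, 'Hobby': False, 'Religious': True}
print(is_binominal(["yes", "no", "yes"]))   # True: two character-based values
print(is_binominal([0, 1, 2]))              # False: three distinct values
```

The second check shows why binominal is the broader term: a yes/no attribute qualifies even though it is not numeric.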
10) In results perspective, we see that some of our attributes appear to have some frequent
patterns in them, and in fact, we begin to see that three attributes look like they might have
some association with one another. The black arrows point to areas where it seems that
Religious organizations might have some natural connections with Family and Hobby
organizations. We can investigate this possible connection further by adding one final
operator to our model. Return to design perspective, and in the operators search box, look
for ‘Create Association’ (again, without the single quotes). Drag the Create Association
Rules operator over and drop it into the spline that connects the fre port to the res port.
This operator takes in frequent pattern matrix data and seeks out any patterns that occur so
frequently that they could be considered rules. Your model should now look like Figure 5-8.
Figure 5-8. Addition of Create Association Rules operator.
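
The frequent patterns described in step 10 are, at bottom, counts of how often combinations of attributes are true in the same row. The Python sketch below shows that counting by brute force. It is not the algorithm RapidMiner uses, and the function name `itemset_support`, its `max_size` parameter, and the four sample rows are all hypothetical, chosen only to make the arithmetic easy to follow.

```python
from itertools import combinations


def itemset_support(rows, max_size=2):
    """Return {frozenset of attribute names: fraction of rows where all are True}."""
    names = sorted(rows[0])
    supports = {}
    for size in range(1, max_size + 1):
        for combo in combinations(names, size):
            # Count the rows in which every attribute of this combination is True.
            hits = sum(1 for row in rows if all(row[name] for name in combo))
            supports[frozenset(combo)] = hits / len(rows)
    return supports


# Hypothetical binominal rows (True = member of that kind of organization).
rows = [
    {"Family": True, "Hobby": True, "Religious": True},
    {"Family": True, "Hobby": False, "Religious": True},
    {"Family": False, "Hobby": True, "Religious": False},
    {"Family": False, "Hobby": False, "Religious": False},
]

supports = itemset_support(rows)
for items, support in sorted(supports.items(), key=lambda kv: -kv[1]):
    print(sorted(items), support)
```

In this toy data, Family and Religious are true together in two of the four rows (support 0.5), while each of them pairs with Hobby in only one row (support 0.25). A pair that co-occurs as often as its members occur individually is the kind of pattern worth testing as a rule.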
11) The Create Association Rules operator can generate both a set of rules (through the
rul port) and a set of associated items (through the ite port). We will simply generate
rules, and for now, accept the default parameters for the Create Association Rules
operator, though note the min confidence parameter, which we will address in the
evaluation phase of our mining. Run your model.
Figure 5-9. The results of our association rule model.
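
To make the role of a minimum confidence threshold concrete, here is a minimal Python sketch. It assumes the usual definition, confidence(A -> B) = support(A and B) / support(A), and it keeps only rules that meet the threshold. The function name `rules_from_supports`, the support values, and the 0.7 threshold are hypothetical; they are not taken from the chapter's data set or from RapidMiner's defaults.

```python
def rules_from_supports(supports, min_confidence):
    """Return (premise, conclusion, confidence) tuples for two-item patterns.

    confidence(A -> B) = support(A and B) / support(A)
    """
    rules = []
    for items, joint in supports.items():
        if len(items) != 2:
            continue  # single attributes cannot form a rule on their own
        a, b = sorted(items)
        # Each pair yields two candidate rules, one in each direction.
        for premise, conclusion in ((a, b), (b, a)):
            confidence = joint / supports[frozenset([premise])]
            if confidence >= min_confidence:
                rules.append((premise, conclusion, confidence))
    return rules


# Hypothetical supports, chosen for illustration only.
supports = {
    frozenset(["Family"]): 0.5,
    frozenset(["Religious"]): 0.4,
    frozenset(["Family", "Religious"]): 0.3,
}

for premise, conclusion, confidence in rules_from_supports(supports, min_confidence=0.7):
    print(f"{premise} -> {conclusion} (confidence {confidence:.2f})")
```

With these numbers only Religious -> Family survives (0.3 / 0.4 = 0.75), while Family -> Religious is dropped (0.3 / 0.5 = 0.6). Lowering the threshold admits more rules, which is why this parameter deserves attention when evaluating the model.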