9
10 items=5 support=20;
11
12 cust customer;
13 target product;
14 run;
----- Potential 1 item sets = 20 -----
Counting items, records read: 7007
Number of customers: 1001
Support level for item sets: 20
Maximum count for a set: 600
Sets meeting support level: 20
Megs of memory used: 0.51
----- Potential 2 item sets = 190 -----
Counting items, records read: 7007
Maximum count for a set: 366
Sets meeting support level: 183
Megs of memory used: 0.51
----- Potential 3 item sets = 1035 -----
Counting items, records read: 7007
Maximum count for a set: 234
Sets meeting support level: 615
Megs of memory used: 0.51
----- Potential 4 item sets = 1071 -----
Counting items, records read: 7007
Maximum count for a set: 137
Sets meeting support level: 317
Megs of memory used: 0.51
----- Potential 5 item sets = 85 -----
Counting items, records read: 7007
Maximum count for a set: 116
Sets meeting support level: 71
Megs of memory used: 0.51
NOTE: The PROCEDURE ASSOC used 0:00:07.86 real 0:00:03.45 cpu.
15
16 proc rulegen in=datassoc
17 out=datrule(label='Output from Proc Rulegen')
18
19 minconf=75;
20 run;
write set 1
write set 2
write set 3
write set 4
write set 5
NOTE: The PROCEDURE RULEGEN used 0:00:06.07 real 0:00:02.69 cpu.
21
22 proc sort data=datrule;
23 by descending lift;
24 run;
NOTE: The data set WORK.DATRULE has 939 observations and 15 variables.
NOTE: The PROCEDURE SORT used 0:00:00.92 real 0:00:00.31 cpu.
25 proc print data=datrule(obs=5) label;
26 var set_size exp_conf conf support lift count
27 rule _lhand _rhand;
28 title 'Top Ten Rules based on Lift';
29 run;
NOTE: The PROCEDURE PRINT used 0:00:00.18 real 0:00:00.11 cpu.
Copyright 2000 by SAS Institute Inc., Cary, NC, USA. All rights reserved.
Before you can run PROC ASSOC, you must create the DMDB data set and
the DMDB catalog by using a PROC DMDB step.
proc dmdb batch data=sampsio.assocs out=dmassoc dmdbcat=catassoc;
id customer;
class product(desc);
run;
The ASSOCIATION procedure determines the products that are related.
The DATA= and DMDBCAT= options identify the DMDB data set and catalog, respectively.
PROC ASSOC writes the related products to the OUT= data set, which is used
as input by the RULEGEN procedure.
proc assoc data=dmassoc dmdbcat=catassoc
out=datassoc(label='Output from Proc Assoc')
The ITEMS= option specifies the maximum size of the item set to be considered
(default=4). The SUPPORT= option specifies the minimum support level that
is required for a rule to be accepted (default = 5% of the largest frequency).
items=5 support=20;
The CUST statement (alias = CUSTOMER) specifies the ID variable. The
TARGET statement specifies the nominal target variable.
cust customer;
target product;
run;
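The effect of a minimum support count can be sketched on a small scale with
ordinary SAS steps. The log for this example reports 20 single items, and 20
items form 20*19/2 = 190 possible pairs, which matches the "Potential 2 item
sets = 190" line. The sketch below is illustrative only and is not a
description of how PROC ASSOC is implemented. The data set TOY, its contents,
the output data set PAIRS, and the threshold of 2 customers are all
hypothetical.

```sas
/* Hypothetical toy data: one row per customer-product purchase,      */
/* laid out like the CUSTOMER and PRODUCT variables in this example.  */
data toy;
   input customer product :$8.;
   datalines;
1 bread
1 milk
1 eggs
2 bread
2 milk
3 bread
3 eggs
4 milk
;
run;

/* Count the customers who bought both items of each pair.            */
/* The condition a.product < b.product keeps each unordered pair once.*/
/* The HAVING clause drops pairs below the assumed minimum count of 2.*/
proc sql;
   create table pairs as
   select a.product                  as item1,
          b.product                  as item2,
          count(distinct a.customer) as n_cust
   from toy as a, toy as b
   where a.customer = b.customer
     and a.product < b.product
   group by a.product, b.product
   having count(distinct a.customer) >= 2
   order by n_cust desc, item1, item2;
quit;

proc print data=pairs noobs;
run;
```

In this toy data, the pairs bread/eggs and bread/milk are each bought by two
customers and are kept, while eggs/milk is bought by only one customer and is
dropped.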
The RULEGEN procedure uses the output from PROC ASSOC to generate the
rules. The rules are written to the OUT= data set.
proc rulegen in=datassoc
out=datrule(label='Output from Proc Rulegen')
The MINCONF= option specifies the minimum confidence required in order
to generate a rule (default = 10).
minconf=75;
run;
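A minimum confidence filter of the kind that MINCONF=75 requests can be
sketched with a DATA step. This is an illustration under assumptions, not the
RULEGEN implementation. The data set CANDIDATES, its variables (LHAND, RHAND,
N_BOTH, N_LEFT), and its counts are hypothetical.

```sas
/* Hypothetical candidate rules: N_BOTH is the number of customers    */
/* who bought both sides, N_LEFT the number who bought the left side. */
data candidates;
   input lhand :$8. rhand :$8. n_both n_left;
   datalines;
bread milk 2 3
milk bread 2 3
bread eggs 2 3
eggs bread 2 2
;
run;

/* Keep only the rules whose confidence is at least 75 percent. */
data rules;
   set candidates;
   length rule $40;
   conf = 100 * n_both / n_left;        /* confidence in percent      */
   if conf >= 75;                       /* subsetting IF, as MINCONF=75 */
   rule = catx(' ==> ', lhand, rhand);  /* text form of the rule      */
run;

proc print data=rules noobs;
   var rule conf n_both n_left;
run;
```

Only the rule eggs ==> bread survives, with a confidence of 100; the other
three candidates have a confidence of about 66.7 and are dropped.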
Because neither PROC ASSOC nor PROC RULEGEN generates printed output, the
remaining code sorts the rules by descending LIFT and then prints a simple
list report of the rules that have the highest LIFT values. The OBS=5 data set
option limits the report to the first five rules, even though the title reads
'Top Ten'. This is done primarily to limit the amount of output displayed in
this example.
proc sort data=datrule;
by descending lift;
run;
proc print data=datrule(obs=5) label;
var set_size exp_conf conf support lift count
rule _lhand _rhand;
title 'Top Ten Rules based on Lift';
run;
The PROC PRINT step produces a list report of the highest-ranked rules based on the LIFT value (five
rows are printed). The output data set from PROC RULEGEN contains the following variables:
SET_SIZE - the number of items in the rule.
EXP_CONF - the expected confidence (right side count / total).
CONF - the confidence (count / left side count).
SUPPORT - the support level (count / total).
LIFT - the lift ratio (confidence / expected confidence).
COUNT - the number of transactions meeting the rule.
RULE - the text of the rule, for example, Left side ==> Right side.
_LHAND - the left side of the rule.
_RHAND - the right side of the rule.
ITEM1, ITEM2, ..., ITEMn+1 - the individual items forming the rule, including the arrow. For this
example, the individual items have been omitted from the list report.
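The definitions above can be checked against the second row of the report
(COUNT=90) with a short DATA step. The total of 1001 comes from the "Number of
customers" line in the log. The left side count of 95 and the right side count
of 126 are not shown anywhere in the output; they are assumed values,
back-calculated so that the formulas agree with the printed row. The variable
names N_LEFT and N_RIGHT are hypothetical.

```sas
data _null_;
   count   = 90;     /* COUNT for row 2 of the report                 */
   total   = 1001;   /* number of customers reported in the log       */
   n_left  = 95;     /* assumed: customers who bought the left side   */
   n_right = 126;    /* assumed: customers who bought the right side  */

   support  = 100 * count   / total;    /* count / total              */
   conf     = 100 * count   / n_left;   /* count / left side count    */
   exp_conf = 100 * n_right / total;    /* right side count / total   */
   lift     = conf / exp_conf;          /* confidence / expected conf */

   /* Write the four measures to the log, rounded to two decimals. */
   put support= 8.2 conf= 8.2 exp_conf= 8.2 lift= 8.2;
run;
```

Under these assumed counts the values round to 8.99, 94.74, 12.59, and 7.53,
which agrees with the SUPPORT, CONF, EXP_CONF, and LIFT columns for that row.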
Top Ten Rules based on Lift 1
OBS  SET_SIZE  EXP_CONF    CONF  SUPPORT   LIFT  COUNT
  1         1      7.39  100.00     7.39  13.53  74.00
  2         5     12.59   94.74     8.99   7.53  90.00
  3         5     10.79   78.99     9.39   7.32  94.00
  4         5     11.89   87.04     9.39   7.32  94.00
  5         5     12.69   92.78     8.99   7.31  90.00

OBS  RULE
  1  bordeaux
  2  sardines & baguette & apples ==> peppers & avocado
  3  turkey & coke ==> olives & ice_cream & bourbon
  4  olives & ice_crea & bourbon ==> turkey & coke
  5  peppers & baguette & apples ==> sardines & avocado

OBS  _LHAND                         _RHAND
  1
  2  sardines & baguette & apple    peppers & avocado
  3  turkey & coke                  olives & ice_crea & bourbon
  4  olives & ice_crea & bourbon    turkey & coke
  5  peppers & baguette & apple     sardines & avocado