Copyright 2000 by SAS Institute Inc., Cary, NC, USA. All rights reserved.
The PRINT procedure lists the first 10 observations in the SAMPSIO.ASSOCS
data set.
proc print data=sampsio.assocs(obs=10);
title 'Partial Listing of the ASSOCS Data Set';
run;
Before you can run the ASSOCIATION and SEQUENCE procedures, you must
create the DMDB data set and the DMDB catalog by using a PROC DMDB step.
proc dmdb batch data=sampsio.assocs out=dmseq dmdbcat=catseq;
id customer time;
class product(desc);
run;
The ASSOCIATION procedure determines the products that are related.
The DATA= and DMDBCAT= options identify the DMDB data set and catalog, respectively.
PROC ASSOC writes the related products to the OUT= data set; this data set
is used as input by the SEQUENCE procedure.
proc assoc data=dmseq dmdbcat=catseq
out=aout(label='Output from Proc Assoc')
The ITEMS= option specifies the maximum size of the item set to be considered
(default = 4). The SUPPORT= option specifies the minimum support level that
is required for a rule to be accepted (default = 5% of the largest frequency).
items=5 support=20;
The CUST statement (alias = CUSTOMER) specifies the ID variable. The
TARGET statement specifies the nominal target variable.
cust customer;
target product;
run;
The DATA= and DMDBCAT= options identify the DMDB data set and catalog,
respectively. The ASSOC= option names the input data set that was created
by the OUT= option in the previous PROC ASSOC run.
proc sequence data=dmseq dmdbcat=catseq
assoc=aout
out=sout(label='Output from Proc Sequence')
The NITEMS= option specifies the maximum number of events for which
rules, or chains, are generated. By default, the SEQUENCE procedure computes
binary sequences (NITEMS=2).
nitems=2;
The CUST statement (alias = CUSTOMER) specifies the ID variable. The
TARGET statement specifies the nominal target variable.
cust customer;
target product;
The VISIT statement names the timing or sequence variable.
visit time;
run;
The SORT procedure sorts the observations in descending order by the
values of support.
proc sort data=sout;
by descending support;
run;
The PRINT procedure lists the first 10 observations in the sorted sequence
data set.
proc print data=sout(obs=10);
var count support conf rule;
title 'Partial Listing of the 2-Item Sequences';
run;
The SEQUENCE Procedure
Example 2: Specifying the Maximum Number of Item Events and Setting the Lower Timing Limit
SEQUENCE Procedure
  Using the NITEMS= option to specify the maximum number of event items
  Using the SAME= option to set the lower timing limit
This example demonstrates how to specify the maximum number of item events and how to set the lower timing
limit of a sequence rule. Before you run the example program, you should submit the PROC DMDB and PROC
ASSOC steps from Example 1.
proc sequence data=dmseq
dmdbcat=catseq
assoc=aout
out=s4out(label = 'Output from Proc Sequence')
nitems=4;
cust customer;
target product;
visit time / same=2;
run;
proc sort data=s4out;
by descending support;
run;
proc print data=s4out(obs=10);
var count support conf rule;
title 'Partial Listing of the 4-Item Sequences';
title2 'Lower Timing Limit Set to 2';
run;
Output
Partial PROC PRINT Listing of the 4-Item Sequence Data Set, Lower Timing Limit Set to 2
When the lower timing limit is set to 2, the rule with the highest support is now a herring purchase followed by a
heineken purchase. About 23.5% of the customer population supports this rule, with a confidence of about 48.4%.
Partial Listing of the 4-Item Sequences
Lower Timing Limit Set to 2
OBS    COUNT    SUPPORT        CONF    RULE
  1      235    23.4765     48.3539    hering ==> heineken
  2      225    22.4775     57.3980    baguette ==> heineken
  3      220    21.9780     69.1824    soda ==> cracker
  4      218    21.7782     68.5535    soda ==> heineken & cracker
  5      218    21.7782     68.5535    soda ==> heineken
  6      215    21.4785     45.4545    olives ==> turkey
  7      213    21.2787     52.8536    bourbon ==> cracker
  8      209    20.8791    100.0000    hering & baguette ==> heineken
  9      201    20.0799     55.3719    avocado ==> heineken
 10      150    14.9850     30.8642    hering ==> cracker
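The SUPPORT and CONF columns in this listing can be checked by hand. The following is a minimal sketch, not part of the original example. It assumes that SUPPORT is COUNT expressed as a percentage of the customer count of 1001 reported in the partial log, and that CONF is COUNT divided by the number of customers who have the left-hand event. Both relationships are inferred from the numbers shown here rather than stated in the documentation. The data set name CHECK_SUPPORT and the variables SUPPORT_CALC and LHS_COUNT are hypothetical names chosen for illustration. The three data lines are copied from the first three rows of the listing.

```sas
/* Sketch: re-derive SUPPORT from COUNT for the first three rules.      */
/* Assumption: 1001 is the customer count reported in the partial log.  */
data check_support;
   length rule $ 40;
   infile datalines dlm=',';
   input count support conf rule $;

   /* Assumption: support = COUNT as a percentage of all customers.     */
   support_calc = 100 * count / 1001;

   /* Assumption: confidence = COUNT / (customers with left-hand side), */
   /* so the left-hand-side count can be backed out from CONF.          */
   lhs_count = round(count / (conf / 100));
   datalines;
235,23.4765,48.3539,hering ==> heineken
225,22.4775,57.3980,baguette ==> heineken
220,21.9780,69.1824,soda ==> cracker
;
run;

/* Show the listed and recomputed values side by side. */
proc print data=check_support noobs;
   format support_calc 8.4;
   var rule count support support_calc conf lhs_count;
run;
```

Under these assumptions, SUPPORT_CALC matches the listed SUPPORT to four decimal places for all three rules (for example, 100 * 235 / 1001 rounds to 23.4765), and LHS_COUNT comes out as 486, 392, and 318.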
Partial Log Listing
1 proc sequence data=dmseq
2 dmdbcat=catseq
3 assoc=aout
4 out=s4out(label = 'Output from Proc Sequence')
5
6 nitems=4;
7 cust customer;
8 target product;
9
10 visit time / same=2;
11 run;
Large itemsets: 1206
Total records read: 7007
Customer count: 1001
Support set to: 20
Total Litem Sequences: 5641
Number >= support 466
--- Number Items: 3 ---
Total records read: 7007
Customer count: 1001
Total Litem Sequences: 5086
Number >= support 12
--- Number Items: 4 ---
Total records read: 7007
Customer count: 1001
Total Litem Sequences: 0