The arboretum procedure




Copyright 2000 by SAS Institute Inc., Cary, NC, USA. All rights reserved.


 

The PRINT procedure lists the first 10 observations in the SAMPSIO.ASSOCS

data set.

proc print data=sampsio.assocs(obs=10);

   title 'Partial Listing of the ASSOCS Data Set';

run;



 

Before you can run the ASSOCIATION and SEQUENCE procedures, you must

create the DMDB data set and the DMDB catalog by using a PROC DMDB step.

proc dmdb batch data=sampsio.assocs out=dmseq dmdbcat=catseq;

   id customer time;

   class product(desc);

run;



 

The ASSOCIATION procedure determines which products are related.

The DATA= and DMDBCAT= options identify the DMDB data set and catalog, respectively.

PROC ASSOC writes the related products to the OUT= data set; this data set

is used as input by the SEQUENCE procedure.

proc assoc data=dmseq dmdbcat=catseq 

   out=aout(label='Output from Proc Assoc')



 

The ITEMS= option specifies the maximum size of the item set to be considered

(default = 4). The SUPPORT= option specifies the minimum support level that

is required for a rule to be accepted (default = 5% of the largest frequency).

   

    items=5 support=20;




 

The CUST statement (alias = CUSTOMER) specifies the ID variable. The

TARGET statement specifies the nominal target variable.

   cust customer;

   target product;

run;



 

The DATA= and DMDBCAT= options identify the DMDB data set and catalog,

respectively. The ASSOC= option names the input data set that was created by

the previous PROC ASSOC run.

proc sequence data=dmseq dmdbcat=catseq

              assoc=aout

              out=sout(label='Output from Proc Sequence')                 



 

The NITEMS= option specifies the maximum number of events for which

rules, or chains, are generated. By default, the SEQUENCE procedure computes

binary sequences (NITEMS=2).

             nitems=2;



 

The CUST statement (alias = CUSTOMER) specifies the ID variable. The

TARGET statement specifies the nominal target variable.

   cust customer;

   target product;



 

The VISIT statement names the timing or sequence variable. 

   visit time;

run;
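
To make concrete what a two-item sequence rule such as A ==> B counts, the following base SAS sketch works on a small invented transaction table. It is only an illustration and not how PROC SEQUENCE is implemented. The data set names TOY and TOY_RULE, the variable N_SEQ, the macro variable N_CUST, and all data values are hypothetical; only the CUSTOMER, TIME, and PRODUCT layout is taken from the example above.

```sas
/* Hypothetical toy transactions with the same three variables as      */
/* SAMPSIO.ASSOCS (CUSTOMER, TIME, PRODUCT). The values are invented.  */
data toy;
   input customer time product $;
   datalines;
1 1 soda
1 2 cracker
2 1 cracker
2 2 soda
3 1 soda
3 3 cracker
4 1 soda
;
run;

proc sql noprint;
   /* Total number of distinct customers (denominator for support). */
   select count(distinct customer) into :n_cust from toy;

   /* Customers who bought soda at an earlier TIME than cracker. */
   create table toy_rule as
   select count(distinct a.customer) as n_seq,
          100 * count(distinct a.customer) / &n_cust as support
   from toy as a, toy as b
   where a.customer = b.customer
     and a.product  = 'soda'
     and b.product  = 'cracker'
     and a.time < b.time;
quit;

proc print data=toy_rule;
   title 'Toy Count for the Sequence soda ==> cracker';
run;
```

With these invented values, only customers 1 and 3 buy soda before cracker, so N_SEQ is 2 out of 4 customers and the support is 50 percent. Customer 2 buys both products but in the opposite order, which is the distinction between a sequence rule and a plain association rule.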



 

The SORT procedure sorts the observations in descending order by the

values of support.

proc sort data=sout;

   by descending support;

run;



 

The PRINT procedure lists the first 10 observations in the sorted sequence

data set.

proc print data=sout(obs=10);

   var count support conf rule;

   title 'Partial Listing of the 2-Item Sequences';

run;



The SEQUENCE Procedure

Example 2: Specifying the Maximum Number of Item Events and Setting the Lower Timing Limit

SEQUENCE Procedure

   Using the NITEMS= option to specify the maximum number of event items

   Using the SAME= option to set the lower timing limit


This example demonstrates how to specify the maximum number of item events and how to set the lower timing

limit of a sequence rule. Before you run the example program, you should submit the PROC DMDB and PROC

ASSOC steps from Example 1.

proc sequence data=dmseq

              dmdbcat=catseq

              assoc=aout

              out=s4out(label = 'Output from Proc Sequence')

 

             nitems=4;



   cust customer;

   target product;

 

   visit time / same=2;



run;

 

proc sort data=s4out;



   by descending support;

run;


 

proc print data=s4out(obs=10);

   var count support conf rule;

   title 'Partial Listing of the 4-Item Sequences';

   title2 'Lower Timing Limit Set to 2';

run;


Output


Partial PROC PRINT Listing of the 4-Item Sequence Data Set, Lower Timing Limit Set to 2

When the lower timing limit is set to 2, the rule with the highest support is a herring purchase (coded as hering in the data)

followed by a heineken purchase. About 23.5% of the customer population supports this rule, with a confidence of about 48.4%.

           Partial Listing of the 4-Item Sequences

                  Lower Timing Limit Set to 2

   OBS       COUNT     SUPPORT        CONF    RULE

     1         235     23.4765     48.3539    hering ==> heineken           

     2         225     22.4775     57.3980    baguette ==> heineken         

     3         220     21.9780     69.1824    soda ==> cracker              

     4         218     21.7782     68.5535    soda ==> heineken & cracker   

     5         218     21.7782     68.5535    soda ==> heineken             

     6         215     21.4785     45.4545    olives ==> turkey             

     7         213     21.2787     52.8536    bourbon ==> cracker           

     8         209     20.8791    100.0000    hering & baguette ==> heineken

     9         201     20.0799     55.3719    avocado ==> heineken          

    10         150     14.9850     30.8642    hering ==> cracker          



Partial Log Listing

1  proc sequence data=dmseq

2                dmdbcat=catseq

3                assoc=aout

4                out=s4out(label = 'Output from Proc Sequence')

5  


6               nitems=4;

7     cust customer;

8     target product;

9  


10     visit time / same=2;

11  run;


Large itemsets:            1206

Total records read:        7007

Customer count:            1001

Support set to:              20

Total Litem Sequences:     5641

Number >= support           466

--- Number Items:         3 ---

Total records read:        7007

Customer count:            1001

Total Litem Sequences:     5086

Number >= support            12

--- Number Items:         4 ---

Total records read:        7007

Customer count:            1001

Total Litem Sequences:        0
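
The SUPPORT and CONF columns in the listing can be cross-checked against COUNT. The sketch below assumes that support is the rule count as a percentage of the customer count (1001 in the log) and that confidence is the rule count as a percentage of the customers who have the left-hand side of the rule. These definitions are assumptions, although the listed values are consistent with them. The data set name S4CHECK and the variables N_CUST, SUPPORT_CHK, and LHS_COUNT are hypothetical names introduced for this illustration.

```sas
/* Sketch: cross-check SUPPORT and CONF in the sorted S4OUT data set.  */
/* N_CUST = 1001 is copied from "Customer count" in the log.           */
data s4check;
   set s4out(obs=10);
   n_cust      = 1001;
   support_chk = 100 * count / n_cust;       /* percent of customers   */
   lhs_count   = round(100 * count / conf);  /* implied left-side count */
run;

proc print data=s4check;
   var rule count support support_chk conf lhs_count;
   title 'Cross-Check of SUPPORT and CONF';
run;
```

For the first rule, 100 * 235 / 1001 is approximately 23.4765, which matches the listed support, and 100 * 235 / 48.3539 is approximately 486, the implied number of customers with a hering purchase. The same implied count of 486 results from the tenth rule (100 * 150 / 30.8642), which supports the assumed definition of confidence.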



