The arboretum procedure




The SEQUENCE Procedure


Overview

Procedure Syntax

PROC SEQUENCE Statement

CUSTOMER Statement

TARGET Statement

VISIT Statement

Details

Examples

Example 1: Performing a Simple 2-Item Sequence Discovery

Example 2: Specifying the Maximum Number of Item Events and Setting the Lower Timing Limit

References

Copyright 2000 by SAS Institute Inc., Cary, NC, USA. All rights reserved.





Overview

The SEQUENCE procedure enables you to perform sequence discovery. Sequence discovery goes one step further than association discovery by taking into account the ordering or timing of the relationship among items, for example, "Of those customers who purchase a new computer, 25% will purchase a laser printer in the next quarter." To perform sequence discovery, you must first run the ASSOCIATION procedure to create and output the data set of the assembled items.

PROC SEQ produces rules similar to those of PROC RULEGEN; however, the rules additionally imply an element of timing. A rule A ==> B implies that event B occurred after event A. The visit (or sequence) variable is used for timing comparison. The sequence variable can have any numeric value, including date or time values. Transactions with missing sequence values are ignored entirely during the sequence computation.
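
The timing semantics of a rule A ==> B can be illustrated outside SAS. The following is a minimal Python sketch, not the algorithm that PROC SEQUENCE actually uses. It assumes a hypothetical layout in which each transaction is a `(customer, visit, item)` tuple, and it counts per customer, which is also an assumption. Transactions whose visit value is missing are dropped before any comparison is made.

```python
from collections import defaultdict

def count_sequence(transactions, first, then):
    """Count customers for whom `then` occurs strictly after `first`.

    `transactions` is an iterable of (customer, visit, item) tuples; this
    layout is an assumption made for illustration. `visit` may be any
    numeric value (for example a date encoded as a number) or None.
    """
    visits = defaultdict(lambda: defaultdict(list))
    for customer, visit, item in transactions:
        if visit is None:  # missing sequence value: ignore the transaction
            continue
        visits[customer][item].append(visit)

    count = 0
    for items in visits.values():
        if first in items and then in items:
            # `then` follows `first` if its latest visit is later than
            # the earliest visit of `first`.
            if max(items[then]) > min(items[first]):
                count += 1
    return count

transactions = [
    ("c1", 1, "computer"), ("c1", 2, "printer"),     # computer, then printer
    ("c2", 2, "computer"), ("c2", 1, "printer"),     # printer came first
    ("c3", 1, "computer"), ("c3", None, "printer"),  # missing visit is ignored
]
matches = count_sequence(transactions, "computer", "printer")
```

Only customer `c1` satisfies computer ==> printer here: `c2` bought the printer first, and the printer transaction of `c3` has no visit value and is therefore discarded.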

To determine the timing element, PROC SEQUENCE uses a sequence variable or time stamp that enables you to measure the time span from observation to observation. This procedure is useful for businesses such as banks or mail-order houses.


Procedure Syntax

PROC SEQUENCE <option(s)>;

CUSTOMER variable(s);

TARGET variable;

VISIT variable /<visit-option(s)>;


PROC SEQUENCE Statement

Invokes the SEQUENCE procedure.

PROC SEQUENCE <option(s)>;

Required Arguments

ASSOC= SAS-data-set

Specifies the SAS data set that was output from PROC ASSOC. This data set is one of the inputs to PROC SEQ.

DATA= SAS-data-set

Specifies the input data source in its DMDB form. This data set is read in order to extract the timing information necessary to generate the sequence rules.

DMDBCAT= SAS-catalog

Identifies the metadata catalog associated with the input DMDB.



Options

NITEMS=integer

Specifies the maximum number of events for which rules, or chains, are generated. If you request chains of more than 2 events, (integer - 2) additional passes through the input file are required.

Default: 2

OUT= SAS-data-set

Specifies the output data set to which the rules are written. The output data set has the following

variables: RULE, COUNT, SUPPORT, CONF, ISET1, ISET2, ..., ISETn.

RULE

Contains the rule text, for example, A & B ==> C & D.



COUNT

Contains the number of transactions that meet the rule.

SUPPORT

Contains the percent of support, that is, the percent of the total number of transactions that qualify for the rule.

Definition: SUPPORT = COUNT/total, where total is the total number of transactions in the data set. The support level is an integer that represents how frequently the combination occurs in the database.

CONF

Contains the percent of confidence.

Definition: CONF = COUNT/lhs_count, where lhs_count is the number of transactions satisfying the left side of the rule.

ISET1, ISET2, ..., ISETn

Contain, in order, the events that form the event chain. PROC SEQUENCE can detect multiple events occurring at the same time and can report them as rules of the type A & B ==> C & D. This means that events A and B occurred at the same time, followed by C and D, which occurred simultaneously afterward.



SUPPORT=integer

Specifies the minimum number of transactions that must satisfy a rule in order for the rule to be accepted. Rules that do not meet the support level are rejected.

Default: If not specified, SUPPORT is set to 2% of the total transaction count.
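
The SUPPORT and CONF definitions and the default SUPPORT= threshold can be worked through numerically. The following Python sketch is an illustration under stated assumptions, not the procedure's implementation: the function names are hypothetical, the ratios are scaled by 100 so that they read as percents, and the 2% default is rounded up to a whole transaction count (the source does not say how it is rounded).

```python
import math

def rule_stats(count, lhs_count, total):
    """Return (support, conf) as percents.

    Follows SUPPORT = COUNT/total and CONF = COUNT/lhs_count; scaling by
    100 to obtain a percent is an assumption of this sketch.
    """
    support = 100.0 * count / total
    conf = 100.0 * count / lhs_count
    return support, conf

def default_min_support(total):
    """Default minimum transaction count: 2% of the total transaction
    count. Rounding up to a whole number is an assumption."""
    return math.ceil(2 * total / 100)

def accept(count, total, min_support=None):
    """Keep a rule only if its transaction count reaches the support level."""
    if min_support is None:
        min_support = default_min_support(total)
    return count >= min_support

total = 1000
support, conf = rule_stats(count=50, lhs_count=200, total=total)
threshold = default_min_support(total)
```

With 1000 transactions, a rule that is met by 50 of them, and whose left side is satisfied by 200, has a support of 5% and a confidence of 25%. The default threshold is 20 transactions, so the rule would be kept.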


CUSTOMER Statement

Specifies the ID variable that identifies each customer to be analyzed.

Alias: CUST

CUSTOMER variable(s);

Required Argument

variable(s)

Specifies the variable or variables that identify the customer to be analyzed.
