The SEQUENCE Procedure
Overview
Procedure Syntax
PROC SEQUENCE Statement
CUSTOMER Statement
TARGET Statement
VISIT Statement
Details
Examples
Example 1: Performing a Simple 2-Item Sequence Discovery
Example 2: Specifying the Maximum Number of Item Events and Setting the Lower Timing Limit
References
Copyright 2000 by SAS Institute Inc., Cary, NC, USA. All rights reserved.
Overview
The SEQUENCE procedure enables you to perform sequence discovery. Sequence discovery goes one
step further than association discovery by taking into account the ordering or timing of the relationship
among items. For example: "Of those customers who purchase a new computer, 25% purchase a
laser printer in the next quarter." To perform a sequence discovery, you must first run the
ASSOCIATION procedure to create and output the data set of the assembled items.
PROC SEQ produces rules similar to those of PROC RULEGEN; however, the rules additionally imply an
element of timing. A rule A==>B implies that event B occurred after event A. The visit or
sequence variable is used for timing comparison. The sequence variable can have any numeric value,
including date or time values. Transactions with missing sequence values are ignored entirely during the
sequence computation.
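
The timing semantics described above can be illustrated with a short Python sketch. It is not how PROC SEQUENCE is implemented. The (customer, visit, item) record layout, the function name count_sequence, and the choice to count customers rather than individual transactions are all assumptions made for illustration. The sketch counts the customers for whom one item occurs at a strictly later sequence value than another, and it skips transactions whose sequence value is missing.

```python
from collections import defaultdict


def count_sequence(transactions, first, then):
    """Count customers for whom `then` occurs strictly after `first`.

    transactions: iterable of (customer, visit, item) tuples, where visit is
    a numeric sequence value, or None when missing (hypothetical layout).
    """
    first_times = defaultdict(list)
    then_times = defaultdict(list)
    for customer, visit, item in transactions:
        if visit is None:
            continue  # missing sequence values are ignored entirely
        if item == first:
            first_times[customer].append(visit)
        elif item == then:
            then_times[customer].append(visit)
    # The rule holds for a customer if the earliest `first` event
    # precedes the latest `then` event.
    return sum(
        1
        for customer, times in first_times.items()
        if then_times[customer] and min(times) < max(then_times[customer])
    )


# Hypothetical sample data: only customer c1 buys a printer after a computer.
transactions = [
    ("c1", 1, "computer"), ("c1", 2, "printer"),
    ("c2", 1, "printer"), ("c2", 3, "computer"),
    ("c3", 1, "computer"), ("c3", None, "printer"),
]
n = count_sequence(transactions, "computer", "printer")
```

In this sample, customer c3 does not count toward computer ==> printer because the printer transaction has no sequence value and is therefore dropped before any comparison is made.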
To determine the timing element, PROC SEQUENCE uses a sequence variable or time stamp that
enables you to measure the time span from observation to observation. This procedure is useful for
businesses such as banks or mail-order houses.
Procedure Syntax
PROC SEQUENCE <option(s)>;
CUSTOMER variable(s);
TARGET variable;
VISIT variable /<visit-option(s)>;
PROC SEQUENCE Statement
Invokes the SEQUENCE procedure.
PROC SEQUENCE <option(s)>;
Required Arguments
ASSOC= SAS-data-set
Specifies the SAS data set that was output by PROC ASSOC. This data set is one of the inputs
to PROC SEQ.
DATA= SAS-data-set
Specifies the input data source in its DMDB form. This data set is read in order to extract the
timing information necessary to generate the sequence rules.
DMDBCAT= SAS-catalog
Identifies the metadata catalog associated with the input DMDB.
Options
NITEMS=integer
Specifies the maximum number of events for which rules, or chains, are generated. If you request
chains of more than two events, (integer - 2) additional passes through the input file are required.
Default: 2
OUT= SAS-data-set
Specifies the output data set to which the rules are written. The output data set has the following
variables: RULE, COUNT, SUPPORT, CONF, ISET1, ISET2, ..., ISETn.
RULE
Contains the rule text, for example, A & B ==> C & D
COUNT
Contains the number of transactions that meet the rule.
SUPPORT
Contains the percent of support, that is, the percent of the total number of transactions that
qualify for the rule.
Definition: SUPPORT = COUNT/total, where total is the total number of transactions in the
data set. The support level is an integer that represents how frequently the combination
occurs in the database.
CONF
Contains the percent of confidence.
Definition: CONF = COUNT/lhs_count, where lhs_count is the number of transactions
that satisfy the left side of the rule.
ISET1, ISET2,..., ISETn
Contain, in order, the events that form the event chain. PROC SEQUENCE can detect
multiple events occurring at the same time and can report them as rules of the type A & B
==> C & D. This means that events A and B occurred at the same time, followed by C and
D, which occurred simultaneously afterwards.
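
As a worked illustration of the SUPPORT and CONF definitions above, the following Python sketch computes both values from hypothetical counts and assembles rule text from a chain of event sets. The function names and sample numbers are assumptions. The factor of 100 reflects the description of both variables as percents, and joining event sets with ==> is a guess based on the single example A & B ==> C & D, not a documented format for longer chains.

```python
def support_pct(count, total):
    """SUPPORT = COUNT/total, expressed as a percent."""
    return 100.0 * count / total


def confidence_pct(count, lhs_count):
    """CONF = COUNT/lhs_count, expressed as a percent."""
    return 100.0 * count / lhs_count


def rule_text(isets):
    """Build rule text from ordered event sets, for example ISET1 and ISET2.

    Events inside one set occurred at the same time and are joined by ' & ';
    successive sets are joined by ' ==> ' (assumed format).
    """
    return " ==> ".join(" & ".join(events) for events in isets)


# Hypothetical counts: 1000 transactions in total, 200 satisfy the left
# side of the rule, and 50 satisfy the whole rule.
total, lhs_count, count = 1000, 200, 50
support = support_pct(count, total)
conf = confidence_pct(count, lhs_count)
rule = rule_text([["A", "B"], ["C", "D"]])
```

With these numbers the rule has 5% support and 25% confidence: it holds in 50 of 1000 transactions overall, and in 50 of the 200 transactions that satisfy its left side.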
SUPPORT=integer
Specifies the minimum number of transactions that must satisfy a rule for the rule to be
accepted. Rules that do not meet the support level are rejected.
Default: If not specified, SUPPORT is set to 2% of the total transaction count.
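
The effect of the SUPPORT= threshold and its default can be sketched in Python as follows. The function names and sample counts are hypothetical. The sketch assumes that a rule whose count equals the threshold is accepted, and it does not round the 2% default, because the rounding behavior is not stated here.

```python
def minimum_support(total_transactions, support=None):
    """Return the support threshold as a transaction count.

    If no explicit value is given, fall back to 2% of the total
    transaction count (left unrounded; rounding is an open assumption).
    """
    if support is not None:
        return support
    return total_transactions * 2 / 100


def accept_rules(rule_counts, total_transactions, support=None):
    """Keep only the rules whose count meets the support threshold."""
    threshold = minimum_support(total_transactions, support)
    return {rule: n for rule, n in rule_counts.items() if n >= threshold}


# Hypothetical rule counts taken from 1000 transactions.
rule_counts = {"A ==> B": 50, "A ==> C": 19, "B ==> C": 20}
default_kept = accept_rules(rule_counts, 1000)
strict_kept = accept_rules(rule_counts, 1000, support=30)
```

With 1000 transactions the default threshold is 20, so A ==> C (count 19) is rejected. Raising the threshold to 30 also rejects B ==> C.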
CUSTOMER Statement
Specifies the ID variable that identifies each customer to be analyzed.
Alias: CUST
CUSTOMER variable(s);
Required Argument
variable(s)
Specifies the variable or variables that identify the customer to be analyzed.