*The SEQUENCE Procedure*
**The SEQUENCE Procedure**
**Overview**
**Procedure Syntax**
PROC SEQUENCE Statement
CUSTOMER Statement
TARGET Statement
VISIT Statement
**Details**
**Examples**
Example 1: Performing a Simple 2-Item Sequence Discovery
Example 2: Specifying the Maximum Number of Item Events and Setting the Lower Timing Limit
**References**
Copyright 2000 by SAS Institute Inc., Cary, NC, USA. All rights reserved.
**Overview**
The SEQUENCE procedure enables you to perform sequence discovery. Sequence discovery goes one
step further than association discovery by taking into account the ordering or timing of the relationship
among items. For example: "Of those customers who purchase a new computer, 25% will
purchase a laser printer in the next quarter." To perform sequence discovery, you must first run the
ASSOCIATION procedure to create and output the data set of the assembled items.
PROC SEQ produces rules similar to those of PROC RULEGEN; however, the rules additionally imply an
element of timing. A rule A ==> B implies that event B occurred after event A. The visit or
sequence variable is used for timing comparison. The sequence variable can have any numeric value,
including date or time values. Transactions with missing sequence values are ignored entirely during the
sequence computation.
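The timing rule described above can be illustrated outside SAS. The following Python sketch is not the algorithm that PROC SEQUENCE implements; it only assumes that each transaction is a (customer, item, sequence value) tuple, and the function and variable names are hypothetical. It finds the customers for whom event B occurs strictly after event A, and it skips transactions whose sequence value is missing.

```python
def customers_with_sequence(transactions, first, second):
    """Return the customers for whom `second` occurs strictly after `first`.

    `transactions` is an iterable of (customer, item, seq) tuples, where
    `seq` is any numeric sequence value or None when it is missing.
    """
    earliest_first = {}   # earliest sequence value of `first`, per customer
    latest_second = {}    # latest sequence value of `second`, per customer
    for customer, item, seq in transactions:
        if seq is None:
            # Missing sequence value: the transaction is ignored entirely.
            continue
        if item == first:
            earliest_first[customer] = min(seq, earliest_first.get(customer, seq))
        if item == second:
            latest_second[customer] = max(seq, latest_second.get(customer, seq))
    # The rule holds when some `second` event is later than some `first` event.
    return {
        customer
        for customer, start in earliest_first.items()
        if customer in latest_second and latest_second[customer] > start
    }


# Hypothetical data: only customer c1 buys the printer after the computer.
transactions = [
    ("c1", "computer", 1), ("c1", "printer", 3),
    ("c2", "printer", 1), ("c2", "computer", 2),
    ("c3", "computer", 1), ("c3", "printer", None),
]
matching = customers_with_sequence(transactions, "computer", "printer")
```

In this data, customer c2 bought the items in the opposite order and customer c3 has a missing sequence value on the printer purchase, so neither one counts toward the rule computer ==> printer.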
To determine the timing element, PROC SEQUENCE uses a sequence variable or time stamp that
enables you to measure the time span from observation to observation. This procedure is useful for
businesses such as banks or mail-order houses.
**Procedure Syntax**
**PROC SEQUENCE** <*option(s)*>;
**CUSTOMER** *variable(s)*;
**TARGET** *variable*;
**VISIT** *variable* </ *visit-option(s)*>;
**PROC SEQUENCE Statement**
**Invokes the SEQUENCE procedure.**
**PROC SEQUENCE** <*option(s)*>;
**Required Arguments**
**ASSOC=***SAS-data-set*
Specifies the SAS data set that PROC ASSOC produced as output. This data set is one of the inputs
to PROC SEQ.
**DATA=***SAS-data-set*
Specifies the input data source in its DMDB form. This data set is read in order to extract the
timing information necessary to generate the sequence rules.
**DMDBCAT=***SAS-catalog*
Identifies the metadata catalog associated with the input DMDB.
**Options**
**NITEMS=***integer*
Specifies the maximum number of events for which rules, or chains, are generated. If you request
chains of more than two events, (*integer* - 2) additional passes through the input file are required.
**Default:**
2
**OUT=***SAS-data-set*
Specifies the output data set to which the rules are written. The output data set has the following
variables: RULE, COUNT, SUPPORT, CONF, ISET1, ISET2, ..., ISETn.
RULE
Contains the rule text, for example, A & B ==> C & D
COUNT
Contains the number of transactions that meet the rule.
SUPPORT
Contains the percent of support, that is, the percent of the total number of transactions that
qualify for the rule.
**Definition:**
SUPPORT = COUNT/total, where *total* is the total number of
transactions in the data set. The support level is an integer that
represents how frequently the combination occurs in the database.
CONF
Contains the percent of confidence.
**Definition:**
CONF = COUNT/lhs_count, where *lhs_count* is the number of
transactions satisfying the left side of the rule.
ISET1, ISET2,..., ISETn
Contain, in order, the events that form the event chain. PROC SEQUENCE can detect
multiple events occurring at the same time and can report them as rules of the type A & B
==> C & D. This means that events A and B occurred at the same time, followed by C and
D, which occurred simultaneously afterwards.
**SUPPORT=***integer*
Specifies the minimum number of transactions that must support a rule in order for the rule to be
accepted. Rules that do not meet the support level are rejected.
**Default:**
If not specified, SUPPORT is set to a number that is 2% of the total
transaction count.
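The arithmetic behind the options above can be checked with a few lines of code. This Python sketch only restates the definitions given in this section; the function names are hypothetical, the multiplication by 100 reflects one reading of "percent", and the default threshold is left unrounded because the text does not say how PROC SEQUENCE rounds it.

```python
def support_percent(count, total):
    """SUPPORT = COUNT/total, expressed as a percent."""
    return 100.0 * count / total


def confidence_percent(count, lhs_count):
    """CONF = COUNT/lhs_count, expressed as a percent."""
    return 100.0 * count / lhs_count


def default_min_support(total):
    """Default SUPPORT= threshold: 2% of the total transaction count.

    Rounding is not described in the text, so the value is returned as is.
    """
    return total * 2 / 100


def extra_passes(nitems):
    """Additional passes through the input file: (nitems - 2), never below 0."""
    return max(nitems - 2, 0)


# Hypothetical counts: 1000 transactions in total, 200 satisfy the left
# side of the rule, and 50 satisfy the whole rule.
support = support_percent(50, 1000)        # 5.0
confidence = confidence_percent(50, 200)   # 25.0
```

With these hypothetical counts, the rule has 5% support and 25% confidence. A request for NITEMS=4 would need two additional passes through the input file.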
**CUSTOMER Statement**
**Specifies the ID variable that identifies each customer to be analyzed.**
**Alias: **CUST
**CUSTOMER** *variable(s)*;
**Required Argument**
**variable(s)**
Specifies the ID variable or variables that identify each customer to be analyzed.