The arboretum procedure




The SEQUENCE Procedure

TARGET Statement

Specifies the name of the product to be analyzed.

TARGET variable;

Required Argument

variable(s)

Specifies the name of the product to be analyzed.

Copyright 2000 by SAS Institute Inc., Cary, NC, USA. All rights reserved.




VISIT Statement

VISIT variable </ visit-option(s)>;

Required Argument

Specifies the timing variable. See Details for an example of the SAME and WINDOW options in the VISIT statement.

variable

Specifies the time-stamp unit to measure. Variable is any numeric value, including date or time values.

Options

visit-option

SAME and WINDOW specify the lower and upper timing limits of a sequence rule: SAME < time difference <= WINDOW.

Visit-option can be as follows:

SAME=same-number

Specifies the lower time limit between the occurrence of two events that you want to associate with each other. If the time difference between the two events is less than or equal to same-number (that is, it is 'too soon'), then the two events are treated as occurring in the same visit, and the transaction is not counted.



Default:

0

WINDOW=window-number



Specifies the maximum time difference between the occurrence of two events that you want to associate with each other. If the time difference is greater than window-number (that is, it is 'too late'), then the transaction is treated as falling outside of the timing window, and the transaction is not counted. For an NITEM-long sequence chain, WINDOW applies to the entire chain.

Default:

MAX



Details

SAME and WINDOW Parameters

Two optional parameters, SAME and WINDOW, are available to define what is 'after'. The rule A==>B implies SAME < TimeB - TimeA <= WINDOW.

Any time difference (TimeB - TimeA) less than or equal to SAME is considered the same time and is consolidated as the same visit and the same transaction. Any time difference exceeding WINDOW falls outside of the timing window and is ignored as well. In other words, SAME lets you define what is 'too soon', that is, event B occurred too soon after event A to qualify for A==>B. Likewise, WINDOW defines 'too late', that is, event B occurred too late after event A to be considered for A==>B.
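The timing rule can be sketched as a small predicate. This is an illustration in Python rather than SAS, and it is not PROC SEQUENCE code: the function name `qualifies` is hypothetical, and modeling the WINDOW default of MAX as infinity is an assumption.

```python
import math

def qualifies(time_a, time_b, same=0, window=math.inf):
    """Return True if event B falls inside the timing window after event A.

    Checks SAME < time_b - time_a <= WINDOW. The defaults mirror the
    documented ones: SAME=0 and WINDOW=MAX (assumed here to mean no limit).
    """
    diff = time_b - time_a
    return same < diff <= window

print(qualifies(1, 4, same=1, window=3))  # difference 3: inside the window -> True
print(qualifies(6, 7, same=1, window=3))  # difference 1: 'too soon' -> False
print(qualifies(2, 7, same=1, window=3))  # difference 5: 'too late' -> False
```

Note that the lower bound is strict and the upper bound is inclusive, so a difference exactly equal to SAME is rejected while a difference exactly equal to WINDOW is accepted.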

Consider the following example:

Customer   Visit   Product
1          1       soda
1          2       apples
1          3       juice
1          4       milk
1          5       bread
2          2       soda
2          6       apples
2          7       milk

With SAME=1, the visits are consolidated as follows:

Customer   Visit   Product
1          1       soda and apples
1          3       juice and milk
1          5       bread
2          2       soda
2          6       apples and milk
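One way to reproduce this consolidation is sketched below in Python (not SAS). The grouping rule is an assumption inferred from the table: each consolidated visit is anchored at its earliest time, and any later event within SAME time units of that anchor joins it. The function name `consolidate` is hypothetical, and the source does not confirm that PROC SEQUENCE groups visits this way internally.

```python
def consolidate(visits, same):
    """Group one customer's (time, product) events into consolidated visits.

    Assumed rule: an event joins the current visit when its time is within
    `same` units of that visit's anchor (earliest) time; otherwise it starts
    a new visit.
    """
    merged = []  # list of (anchor_time, [products])
    for time, product in sorted(visits):
        if merged and time - merged[-1][0] <= same:
            merged[-1][1].append(product)
        else:
            merged.append((time, [product]))
    return merged

customer_1 = [(1, "soda"), (2, "apples"), (3, "juice"), (4, "milk"), (5, "bread")]
customer_2 = [(2, "soda"), (6, "apples"), (7, "milk")]

print(consolidate(customer_1, same=1))
# [(1, ['soda', 'apples']), (3, ['juice', 'milk']), (5, ['bread'])]
print(consolidate(customer_2, same=1))
# [(2, ['soda']), (6, ['apples', 'milk'])]
```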


Customer 1 is counted for apples ==> milk, but Customer 2 is not, because Customer 2's apples and milk fall in the same consolidated visit. Both customers are counted for soda ==> milk.

If WINDOW=3 were also specified, then only Customer 1 would count for soda ==> milk. Customer 2 would not qualify, because the time difference between soda and milk exceeds 3.
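The counts above can be checked with a short Python sketch (again, an illustration, not PROC SEQUENCE itself). The `consolidated` dictionary is transcribed from the SAME=1 table, the helper name `customers_supporting` is hypothetical, and treating an unspecified WINDOW as infinity is an assumption.

```python
import math

# Consolidated visits from the SAME=1 table: customer -> {visit time: products}
consolidated = {
    1: {1: {"soda", "apples"}, 3: {"juice", "milk"}, 5: {"bread"}},
    2: {2: {"soda"}, 6: {"apples", "milk"}},
}

def customers_supporting(first, then, same=0, window=math.inf):
    """Return the customers with a visit pair satisfying first ==> then."""
    supporters = []
    for customer, visits in consolidated.items():
        if any(
            first in items_a and then in items_b and same < t_b - t_a <= window
            for t_a, items_a in visits.items()
            for t_b, items_b in visits.items()
        ):
            supporters.append(customer)
    return supporters

print(customers_supporting("apples", "milk", same=1))          # [1]
print(customers_supporting("soda", "milk", same=1))            # [1, 2]
print(customers_supporting("soda", "milk", same=1, window=3))  # [1]
```

For Customer 2, soda occurs at visit 2 and milk at consolidated visit 6, a difference of 4, which is why WINDOW=3 excludes that customer.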


Examples

The following examples were executed using the HP-UX version 10.20 operating system and the SAS software release 6.12TS045.

Example 1: Performing a Simple 2-Item Sequence Discovery

Example 2: Specifying the Maximum Number of Item Events and Setting the Lower Timing Limit


Example 1: Performing a Simple 2-Item Sequence Discovery

Features: ASSOCIATION and SEQUENCE Procedures
   - Specifying the Maximum Item-Set Size
   - Setting the Support Level
   - Specifying the Number of Events

The following example demonstrates how to perform a sequence discovery using the ASSOCIATION and SEQUENCE procedures. The example data set SAMPSIO.ASSOCS (stored in the sample library) contains 7,007 separate customer transactions. CUSTOMER is an ID variable that identifies the customers. PRODUCT is the nominal target variable that identifies the items. TIME is the visit variable that measures the time span from observation to observation.

As a marketing analyst for a grocery chain, you want to identify likely 2-item purchase sequences. This information may help you make decisions, such as when to distribute coupons, when to put a product on sale, or how to present items in store displays.



Program

 

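/* List the first 10 observations of the sample data set. */
proc print data=sampsio.assocs(obs=10);
   title 'Partial Listing of the ASSOCS Data Set';
run;

/* Create the data mining database (DMDB) data set and catalog. */
proc dmdb batch data=sampsio.assocs out=dmseq dmdbcat=catseq;
   id customer time;
   class product(desc);
run;

/* Run the association discovery; ITEMS= sets the maximum item-set size */
/* and SUPPORT= sets the support level.                                 */
proc assoc data=dmseq dmdbcat=catseq
   out=aout(label='Output from Proc Assoc')
   items=5 support=20;
   cust customer;
   target product;
run;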



