The ASSOC Procedure
Example
PROC ASSOC must be executed before PROC RULEGEN or PROC SEQUENCE is run.
See the syntax sections of the RULEGEN and SEQUENCE procedures for examples of PROC ASSOC code.
Copyright 2000 by SAS Institute Inc., Cary, NC, USA. All rights reserved.
References
Agrawal, R., Imielinski, T., and Swami, A. (1993), "Mining Association Rules between Sets of
Items in Large Databases," Proceedings, ACM SIGMOD Conference on Management of Data,
207-216, Washington, D.C.
Berry, M. J. A. and Linoff, G. (1997), Data Mining Techniques for Marketing, Sales, and
Customer Support, New York: John Wiley and Sons, Inc.
The DECIDE Procedure
Overview
Procedure Syntax
PROC DECIDE Statement
CODE Statement
DECISION Statement
FREQ Statement
POSTERIORS Statement
PREDICTED Statement
TARGET Statement
Details
Example
Example 1: Using the DECIDE Procedure Following the DISCRIM Procedure
References
Overview
The DECIDE procedure produces optimal decisions based on a user-supplied decision matrix, prior
probabilities, and output from a modeling procedure. The output from the modeling procedure may be
either posterior probabilities for the classes of a categorical target variable or predicted values of an
interval target variable. The DECIDE procedure can also adjust posterior probabilities for changes in the
prior probabilities. Background and formulas for decision processing are given in the chapter on
"Predictive Modeling."
The decision matrix contains columns (decision variables) corresponding to each decision and rows
(observations) corresponding to target values. The values of the decision variables represent
target-specific consequences, which may be profit, loss, or revenue. These consequences are the same
for all cases being scored.
For a categorical target variable, there should be one row for each category. The value in the decision
matrix that is located at a given row and column specifies the consequence of making the decision
corresponding to the column when the target value corresponds to the row.
For an interval target variable, each row defines a knot in a piecewise linear spline function. The
consequence of making a decision is computed by interpolating in the corresponding column of the
decision matrix. If the predicted target value is outside the range of knots in the decision matrix, the
consequence of a decision is computed by linear extrapolation.
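The interpolation rule for interval targets can be illustrated outside of SAS. The following Python sketch is not PROC DECIDE code; it assumes that extrapolation continues the slope of the nearest end segment, and the knot values, the column names (mail, no_mail), and the function name are hypothetical.

```python
from bisect import bisect_right

def interpolate_consequence(knots, values, predicted):
    """Piecewise linear interpolation over (knot, value) pairs.

    Outside the knot range, the nearest end segment is extended linearly
    (an assumption about how the extrapolation is carried out).
    """
    if len(knots) < 2 or len(knots) != len(values):
        raise ValueError("need at least two knots and one value per knot")
    # Pick the segment [knots[i], knots[i + 1]]; clamp the index so that
    # out-of-range predictions reuse the first or last segment.
    i = bisect_right(knots, predicted) - 1
    i = max(0, min(i, len(knots) - 2))
    x0, x1 = knots[i], knots[i + 1]
    y0, y1 = values[i], values[i + 1]
    slope = (y1 - y0) / (x1 - x0)
    return y0 + slope * (predicted - x0)

# Hypothetical decision matrix: each row is a knot of the target value,
# each column holds the consequence of one decision at that knot.
knots = [0.0, 100.0, 200.0]
mail = [-5.0, 15.0, 25.0]
no_mail = [0.0, 0.0, 0.0]

inside = interpolate_consequence(knots, mail, 50.0)    # between two knots
above = interpolate_consequence(knots, mail, 250.0)    # beyond the last knot
below = interpolate_consequence(knots, mail, -50.0)    # before the first knot
```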
For each decision, there may also be either a cost variable or a numeric cost constant. These values
represent case-specific consequences, which are always costs. These consequences do not
depend on the target values of the cases being scored. Costs are used for computing return on investment
as (revenue-cost)/cost.
Cost variables may be specified only if the decision data set contains revenue, not profit or loss.
Therefore, if revenues and costs are specified, profits are computed as revenue minus cost. If revenues
are specified without costs, the costs are assumed to be 0. The interpretation of consequences as profits,
losses, revenues, and costs is needed only to compute return on investment. You can specify values in
the decision data set that are target-specific consequences but that may have some practical interpretation
other than profit, loss, or revenue. Likewise, you can specify values for the cost variables that are
case-specific consequences but that may have some practical interpretation other than costs. If the
revenue/cost interpretation is not applicable, the values computed for return on investment may not be
meaningful.
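The relationship among revenue, cost, profit, and return on investment described above can be written out as a small Python sketch. It is an illustration only, not the procedure's implementation; the function name is hypothetical, and returning None when the cost is 0 is an assumption, because the text does not say how a zero cost is handled in the ratio.

```python
def profit_and_roi(revenue, cost=0.0):
    """Return (profit, return on investment) for one decision.

    profit = revenue - cost
    ROI    = (revenue - cost) / cost
    A missing cost defaults to 0, as stated in the text. With a cost of 0
    the ratio is undefined; returning None is an assumption of this sketch.
    """
    profit = revenue - cost
    roi = profit / cost if cost != 0 else None
    return profit, roi

with_cost = profit_and_roi(12.0, 2.0)   # revenue 12, cost 2
without_cost = profit_and_roi(12.0)     # cost assumed to be 0
```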
The DECIDE procedure will choose the optimal decision for each observation. If the decision data set is
of TYPE=PROFIT or REVENUE, the decision that produces the maximum expected or estimated profit
is chosen. If the decision data set is of TYPE=LOSS, the decision that produces the minimum expected or
estimated loss is chosen.
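For a categorical target, the selection rule can be sketched in Python as follows. The sketch assumes that the expected consequence of a decision is the sum, over target classes, of the posterior probability times the decision matrix entry; the class names, decision names, and function names are hypothetical, and case-specific costs are omitted.

```python
def expected_consequences(decision_matrix, posteriors):
    """Expected consequence of each decision for one case.

    decision_matrix maps target class -> {decision: consequence};
    posteriors maps target class -> posterior probability.
    """
    decisions = next(iter(decision_matrix.values())).keys()
    return {
        d: sum(posteriors[c] * decision_matrix[c][d] for c in decision_matrix)
        for d in decisions
    }

def choose_decision(decision_matrix, posteriors, matrix_type="PROFIT"):
    """Maximize for PROFIT or REVENUE matrices, minimize for LOSS matrices."""
    expected = expected_consequences(decision_matrix, posteriors)
    pick = min if matrix_type == "LOSS" else max
    best = pick(expected, key=expected.get)
    return best, expected

# Hypothetical two-class, two-decision example.
matrix = {
    "responder":    {"mail": 10.0, "skip": 0.0},
    "nonresponder": {"mail": -1.0, "skip": 0.0},
}
posteriors = {"responder": 0.2, "nonresponder": 0.8}

profit_choice, expected = choose_decision(matrix, posteriors, "PROFIT")
loss_choice, _ = choose_decision(matrix, posteriors, "LOSS")
```

Reading the same numbers as losses reverses the choice, because the smaller expected value is then preferred.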
If the actual value of the target variable is known, the DECIDE procedure will calculate:
- The consequence of the chosen decision for the actual target value for each case.
- The best possible consequence for each case.
- Summary statistics giving the total and average profit or loss.
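The quantities computed when the actual target value is known can be illustrated with a short Python sketch. It is not the procedure's implementation; the decision matrix, the cases, and the names in the returned summary are hypothetical.

```python
def evaluate_cases(decision_matrix, cases, maximize=True):
    """Compare chosen decisions with actual target values.

    decision_matrix maps target class -> {decision: consequence}.
    cases is a list of (chosen_decision, actual_class) pairs.
    maximize=True treats consequences as profits, False as losses.
    """
    pick = max if maximize else min
    per_case = []
    for chosen, actual in cases:
        row = decision_matrix[actual]
        realized = row[chosen]        # consequence of the chosen decision
        best = pick(row.values())     # best possible consequence for this case
        per_case.append((realized, best))
    total = sum(realized for realized, _ in per_case)
    return {
        "per_case": per_case,
        "total": total,
        "average": total / len(per_case),
        "best_total": sum(best for _, best in per_case),
    }

matrix = {
    "responder":    {"mail": 10.0, "skip": 0.0},
    "nonresponder": {"mail": -1.0, "skip": 0.0},
}
cases = [("mail", "responder"), ("mail", "nonresponder"), ("skip", "responder")]
summary = evaluate_cases(matrix, cases)
```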
Some modeling procedures assume that the prior probabilities for categorical variable level membership
are either all equal or proportional to the relative frequency of the corresponding response level in the
data set. PROC DECIDE allows you to specify other prior probabilities. Thus, you can conduct a
sensitivity analysis without rerunning the modeling procedure.
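One common way to adjust posterior probabilities for new priors is to reweight each posterior by the ratio of the new prior to the old prior and then renormalize. The Python sketch below shows that standard Bayes reweighting; whether PROC DECIDE uses exactly this formula is an assumption, since the formulas themselves are given in the chapter on "Predictive Modeling." The class labels and the function name are hypothetical.

```python
def adjust_posteriors(posteriors, old_priors, new_priors):
    """Reweight posteriors for a change in prior probabilities.

    Each posterior is multiplied by new_prior / old_prior for its class,
    and the results are rescaled to sum to 1. This is the usual Bayes
    adjustment, assumed here rather than taken from the procedure.
    """
    weighted = {
        c: posteriors[c] * new_priors[c] / old_priors[c] for c in posteriors
    }
    total = sum(weighted.values())
    return {c: w / total for c, w in weighted.items()}

# A model trained on balanced data, rescored for a population in which
# only 10 percent of the cases are responders (hypothetical figures).
posteriors = {"responder": 0.5, "nonresponder": 0.5}
old_priors = {"responder": 0.5, "nonresponder": 0.5}
new_priors = {"responder": 0.1, "nonresponder": 0.9}

adjusted = adjust_posteriors(posteriors, old_priors, new_priors)
```

Because only the posteriors and priors enter the calculation, several sets of priors can be tried against the same model output.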
Procedure Syntax
PROC DECIDE <option(s)>;
CODE <option(s)>;
DECISION DECDATA= SAS-data-set <option(s)>;
FREQ variable;
POSTERIORS variable(s);
PREDICTED variable;
TARGET variable;