The arboretum procedure

Date: 30.04.2018
The ASSOC Procedure

Example

PROC ASSOC must be executed before PROC RULEGEN or PROC SEQUENCE is run.

See the syntax sections of the RULEGEN and SEQUENCE procedures for examples of PROC ASSOC code.

The ASSOC Procedure

References

Agrawal, R., Imielinski, T., and Swami, A. (1993), "Mining Association Rules between Sets of Items in Large Databases", Proceedings, ACM SIGMOD Conference on Management of Data, 207-216, Washington, D.C.

Berry, M. J. A. and Linoff, G. (1997), Data Mining Techniques for Marketing, Sales, and Customer Support, New York: John Wiley and Sons, Inc.

The DECIDE Procedure

Overview

Procedure Syntax

PROC DECIDE Statement

CODE Statement

DECISION Statement

FREQ Statement

POSTERIORS Statement

PREDICTED Statement

TARGET Statement

Details

Example

Example 1: Using the DECIDE Procedure Following the DISCRIM Procedure

References

The DECIDE Procedure

Overview

The DECIDE procedure produces optimal decisions based on a user-supplied decision matrix, prior probabilities, and output from a modeling procedure. The output from the modeling procedure may be either posterior probabilities for the classes of a categorical target variable or predicted values of an interval target variable. The DECIDE procedure can also adjust posterior probabilities for changes in the prior probabilities. Background and formulas for decision processing are given in the chapter on "Predictive Modeling."

The decision matrix contains columns (decision variables) corresponding to each decision and rows (observations) corresponding to target values. The values of the decision variables represent target-specific consequences, which may be profit, loss, or revenue. These consequences are the same for all cases being scored.

For a categorical target variable, there should be one row for each category. The value in the decision matrix at a given row and column specifies the consequence of making the decision corresponding to the column when the target value corresponds to the row.

For an interval target variable, each row defines a knot in a piecewise linear spline function. The consequence of making a decision is computed by interpolating in the corresponding column of the decision matrix. If the predicted target value is outside the range of knots in the decision matrix, the consequence of a decision is computed by linear extrapolation.
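The interpolation described above can be sketched in Python. This is an illustration only, not the procedure's own code: the function name `interpolate_consequence` and the sample knots are hypothetical, the knots are assumed to be sorted in strictly increasing order, and extrapolation is assumed to continue the slope of the nearest end segment.

```python
from bisect import bisect_right


def interpolate_consequence(knots, values, predicted):
    """Consequence of one decision for a predicted interval target value.

    knots  -- target values from the rows of the decision matrix (assumed
              strictly increasing)
    values -- consequences in one decision column, one per knot
    Inside the knot range this interpolates linearly; outside it, the
    nearest end segment is extended (an assumption of this sketch).
    """
    if len(knots) < 2 or len(knots) != len(values):
        raise ValueError("need at least two knots and one value per knot")
    # Find the segment to the left of the predicted value, then clamp the
    # index so values beyond either end reuse the first or last segment.
    i = bisect_right(knots, predicted) - 1
    i = max(0, min(i, len(knots) - 2))
    x0, x1 = knots[i], knots[i + 1]
    y0, y1 = values[i], values[i + 1]
    slope = (y1 - y0) / (x1 - x0)
    return y0 + slope * (predicted - x0)
```

For example, with knots 0, 10, 20 and consequences 0, 5, 25, a predicted value of 15 falls on the second segment, while a predicted value of 30 is handled by extending that same segment.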

For each decision, there may also be either a cost variable or a numeric constant. The values of these variables represent case-specific consequences, which are always costs. These consequences do not depend on the target values of the cases being scored. Costs are used for computing return on investment as (revenue - cost)/cost.

Cost variables may be specified only if the decision data set contains revenue, not profit or loss. Therefore, if revenues and costs are specified, profits are computed as revenue minus cost. If revenues are specified without costs, the costs are assumed to be 0. The interpretation of consequences as profits, losses, revenues, and costs is needed only to compute return on investment. You can specify values in the decision data set that are target-specific consequences but that may have some practical interpretation other than profit, loss, or revenue. Likewise, you can specify values for the cost variables that are case-specific consequences but that may have some practical interpretation other than costs. If the revenue/cost interpretation is not applicable, the values computed for return on investment may not be meaningful.
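As a small worked illustration of the revenue and cost arithmetic, the following Python sketch computes profit as revenue minus cost and return on investment as (revenue - cost)/cost. The function name `profit_and_roi` is hypothetical, and returning `None` when the cost is 0 is an assumption of this sketch; the text does not say how the procedure reports return on investment in that case.

```python
def profit_and_roi(revenue, cost=0.0):
    """Return (profit, return on investment) for one decision.

    The cost defaults to 0, matching the rule that revenues specified
    without costs are treated as having zero cost.
    """
    profit = revenue - cost
    # ROI is undefined for a zero cost; None marks that case here
    # (an assumption of this sketch, not documented behavior).
    roi = profit / cost if cost != 0 else None
    return profit, roi
```

With a revenue of 150 and a cost of 50, the profit is 100 and the return on investment is 2.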

The DECIDE procedure chooses the optimal decision for each observation. If the decision data set is of TYPE=PROFIT or REVENUE, the decision that produces the maximum expected or estimated profit is chosen. If the decision data set is of TYPE=LOSS, the decision that produces the minimum expected or estimated loss is chosen.
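For a categorical target, the selection rule can be sketched as follows. This is a minimal illustration under stated assumptions: the expected consequence of a decision is taken to be the posterior-weighted sum of its column in the decision matrix, costs are taken to be 0 (so revenue equals profit), and the names `expected_consequences` and `choose_decision` and the nested-dictionary layout are hypothetical.

```python
def expected_consequences(decision_matrix, posteriors):
    """Expected consequence of each decision for one case.

    decision_matrix[target_class][decision] holds the consequence;
    posteriors[target_class] holds the posterior probability.
    """
    decisions = list(next(iter(decision_matrix.values())).keys())
    return {
        d: sum(posteriors[t] * decision_matrix[t][d] for t in decision_matrix)
        for d in decisions
    }


def choose_decision(decision_matrix, posteriors, matrix_type="PROFIT"):
    """Pick the decision with maximum expected profit or minimum expected loss."""
    expected = expected_consequences(decision_matrix, posteriors)
    if matrix_type in ("PROFIT", "REVENUE"):
        best = max(expected, key=expected.get)
    elif matrix_type == "LOSS":
        best = min(expected, key=expected.get)
    else:
        raise ValueError("matrix_type must be PROFIT, REVENUE, or LOSS")
    return best, expected[best]


# Hypothetical two-class, two-decision profit matrix used for illustration.
PROFIT_MATRIX = {
    "buyer": {"mail": 10.0, "skip": 0.0},
    "nonbuyer": {"mail": -1.0, "skip": 0.0},
}
```

Calling `choose_decision(PROFIT_MATRIX, {"buyer": 0.2, "nonbuyer": 0.8})` selects `"mail"`, because its expected profit is positive while that of `"skip"` is 0.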

If the actual value of the target variable is known, the DECIDE procedure will calculate:

The consequence of the chosen decision for the actual target value for each case.

The best possible consequence for each case.

Summary statistics giving the total and average profit or loss.
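The three quantities listed above can be sketched in Python for a categorical target. The function name `evaluate_decisions`, the dictionary layout, and the sample matrix are hypothetical, and costs are assumed to be 0.

```python
def evaluate_decisions(decision_matrix, chosen, actual, matrix_type="PROFIT"):
    """Compare chosen decisions with the known target values.

    decision_matrix[target_class][decision] holds the consequence;
    chosen and actual are parallel lists, one entry per case.
    """
    # The best consequence is the largest profit or the smallest loss.
    pick_best = min if matrix_type == "LOSS" else max
    realized = [decision_matrix[a][d] for d, a in zip(chosen, actual)]
    best_possible = [pick_best(decision_matrix[a].values()) for a in actual]
    total = sum(realized)
    return {
        "realized": realized,            # consequence of the chosen decision
        "best_possible": best_possible,  # best consequence given the actual target
        "total": total,
        "average": total / len(realized),
    }


# Hypothetical profit matrix used for illustration.
EXAMPLE_MATRIX = {
    "buyer": {"mail": 10.0, "skip": 0.0},
    "nonbuyer": {"mail": -1.0, "skip": 0.0},
}
```

Comparing the `realized` and `best_possible` lists shows how much profit was lost on each case by not knowing the target value in advance.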

Some modeling procedures assume that the prior probabilities for categorical variable level membership are either all equal or proportional to the relative frequency of the corresponding response level in the data set. PROC DECIDE allows you to specify other prior probabilities. Thus, you can conduct a sensitivity analysis without rerunning the modeling procedure.
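One common way to adjust posterior probabilities for new priors is to reweight each posterior by the ratio of new prior to old prior and then renormalize. The Python sketch below shows that standard Bayes-rule reweighting; it is an assumption that this matches the formulas in the "Predictive Modeling" chapter, and the function name `adjust_posteriors` is hypothetical.

```python
def adjust_posteriors(posteriors, old_priors, new_priors):
    """Reweight posteriors for a change in prior probabilities.

    Each argument maps a target class to a probability. Each posterior is
    scaled by new_prior / old_prior, and the results are renormalized so
    that they sum to 1.
    """
    weighted = {
        c: posteriors[c] * new_priors[c] / old_priors[c] for c in posteriors
    }
    total = sum(weighted.values())
    return {c: w / total for c, w in weighted.items()}
```

Because only the posteriors and the two sets of priors are needed, different priors can be tried on the same scored data, which is what makes a sensitivity analysis possible without refitting the model.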

The DECIDE Procedure

Procedure Syntax

PROC DECIDE <option(s)>;

CODE <option(s)>;

DECISION DECDATA= SAS-data-set <option(s)>;

FREQ variable;

POSTERIORS variable(s);

PREDICTED variable;

TARGET variable;
