The arboretum procedure

Date: 30.04.2018

The DECIDE Procedure

PROC DECIDE Statement

Invokes the DECIDE procedure.

Discussion: The PROC DECIDE statement runs the DECIDE procedure and identifies the input and output data sets. You also need the following statements:

A DECISION statement

Either a POSTERIORS or a PREDICTED statement

A TARGET statement

PROC DECIDE <option(s)>;

Options

DATA= SAS-data-set

Specifies the input data set that contains the output from a modeling procedure.

Default: _LAST_

OUT= SAS-data-set

Specifies the output data set that contains the following variables:

the variables from the input data set

the chosen decision (prefix D_)

the expected consequence of the chosen decision (prefix EL_ or EP_)

If the target value is in the input data set, the output data set also contains the following variables:

the consequence of the chosen decision computed from the target value (prefix CL_ or CP_)

the consequence of the best possible decision knowing the target value (prefix BL_ or BP_)

If PRIORVAR= and OLDPRIORVAR= variables are specified, then the output data set also contains the recalculated posteriors.
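The text does not give the formula used to recalculate the posteriors. As an illustration only, the sketch below applies the standard Bayes-rule prior correction (scale each posterior by the ratio of new prior to old prior, then renormalize). It is written in Python rather than SAS, the function name `adjust_posteriors` and the class labels are hypothetical, and it is an assumption, not a confirmed description, that PROC DECIDE computes the same quantity.

```python
def adjust_posteriors(posteriors, old_priors, new_priors):
    """Rescale class posteriors from old priors to new priors, then renormalize.

    All three arguments are dicts keyed by class label (hypothetical layout).
    """
    weighted = {
        cls: p * new_priors[cls] / old_priors[cls]
        for cls, p in posteriors.items()
    }
    total = sum(weighted.values())
    return {cls: w / total for cls, w in weighted.items()}


# A model trained on balanced data, scored for a population with 10% "yes".
example = adjust_posteriors(
    posteriors={"yes": 0.5, "no": 0.5},
    old_priors={"yes": 0.5, "no": 0.5},
    new_priors={"yes": 0.1, "no": 0.9},
)
```

In this example a case that the model considered a coin flip is pulled toward the new priors, because the evidence in the posterior was neutral.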

Note: If you want to create a permanent SAS data set, you must specify a two-level name. For more information on this topic, see "SAS Files" and "DATA Step Concepts" in SAS Language Reference: Concepts.

Default: If the OUT= option is omitted, PROC DECIDE creates an output data set and names it according to the DATAn convention, just as if you had omitted a data set name in a DATA statement.

OUTFIT= SAS-data-set

Specifies the output data set that contains statistics including the total and average profit or loss.

The OUTFIT= option may not be specified with ROLE=SCORE.

Default: None

ROLE=TRAIN|VALID|VALIDATION|TEST|SCORE

Specifies whether the DATA= data set is a training set, validation set, test set, or scoring set. The ROLE= option affects the names of the variables in the OUTFIT= data set.

Default: TEST

CODE Statement

Generates SAS DATA step code that can be used to score data sets.

Tip: If neither FILE= nor METABASE= is specified, then the SAS code is written to the SAS log. You may specify both FILE= and METABASE= to write code to both locations. The TARGET variable must appear in the DATA= data set as well as the DECDATA= data set.

CODE <code-option(s)>;

Code Options

FILE='filename'

Specifies a path for writing the code to an external file. For example: FILE="c:\mydir\scorecode.sas".

Default: None

FORMAT=format

Specifies the numeric format to be used when printing numeric constants. For example, FORMAT=BEST20.

Default: FORMAT=BEST12

GROUP= group-name

Specifies the group identifier (up to four characters) for group processing.

Default: GROUP=_

METABASE=screenspec

Specifies a catalog entry to which the code is written. For example, METABASE=mylib.mycat.myentry.

Default: None

RESIDUAL

Specifies that variables that depend on the target variable, such as the BL_, BP_, CL_, and CP_ variables, are to be computed in the code.

DECISION Statement

Specifies information used for decision processing in the DECIDE, DMREG, NEURAL, and SPLIT procedures. This documentation applies to all four procedures.

Tip: The DECISION statement is required in the DECIDE procedure, but not in the DMREG, NEURAL, or SPLIT procedures.

DECISION DECDATA= SAS-data-set <option(s)>;

DECDATA= SAS-data-set

Specifies the input data set that contains the decision matrix and/or prior probabilities. The DECDATA= data set must also contain the target variable.

The DECDATA= data set may contain decision variables specified by means of the DECVARS= option, or prior probability variable(s) specified by means of the PRIORVAR= option and/or the OLDPRIORVAR= option, or both.

The target variable is specified by means of the TARGET statement in the DECIDE, NEURAL, and SPLIT procedures or by using the MODEL statement in the DMREG procedure.

For a categorical target variable, there should be one row for each class. The value in the decision matrix at a given row and column specifies the consequence of making the decision that corresponds to the column when the target class corresponds to the row. If any class appears twice or more in the DECDATA= data set, an error message is printed and the procedure terminates. For the DMREG, NEURAL, and SPLIT procedures, all class values in the training set must also appear in the DECDATA= data set, but the DECDATA= data set may contain class values that are not in the training set. For the DECIDE procedure, any class value in the DATA= data set that is not found in the DECDATA= data set is treated in the same way as a missing class value. The DECDATA= data set may contain class values that are not in the DATA= data set, but the classes in the DECDATA= data set must correspond exactly with the variables in the POSTERIORS statement.
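To make the row-and-column reading of the decision matrix concrete, here is a minimal sketch in Python (not SAS) of how a decision could be chosen from posterior probabilities and a profit matrix. The matrix values, the class and decision labels, and the functions `expected_profit` and `decide` are hypothetical. The dictionary keys reuse only the documented prefixes (D_, EP_, CP_, BP_); real output variable names will differ. The sketch assumes a profit matrix, so the best decision is the one with the largest expected value.

```python
# Hypothetical profit matrix: one row per target class, one column per decision.
PROFIT = {
    "buyer":    {"mail": 10.0, "skip": 0.0},
    "nonbuyer": {"mail": -1.0, "skip": 0.0},
}


def expected_profit(posteriors, matrix):
    """Expected consequence of each decision: sum over classes of P(class) * entry."""
    if set(posteriors) != set(matrix):
        # Mirrors the rule that classes must correspond exactly with the posteriors.
        raise ValueError("posterior classes must match the matrix rows exactly")
    decisions = next(iter(matrix.values())).keys()
    return {d: sum(posteriors[c] * matrix[c][d] for c in matrix) for d in decisions}


def decide(posteriors, matrix, actual=None):
    """Pick the decision with the highest expected profit for one case."""
    ep = expected_profit(posteriors, matrix)
    chosen = max(ep, key=ep.get)
    result = {"D_": chosen, "EP_": ep[chosen]}
    # A class that is absent from the matrix is handled like a missing target:
    # no target-based consequences are computed.
    if actual in matrix:
        result["CP_"] = matrix[actual][chosen]        # consequence of the chosen decision
        result["BP_"] = max(matrix[actual].values())  # best achievable knowing the target
    return result


row = decide({"buyer": 0.2, "nonbuyer": 0.8}, PROFIT, actual="nonbuyer")
```

Here mailing has the higher expected profit, so it is chosen even though the case turns out to be a nonbuyer, which is why the computed consequence is lower than the best possible one.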

For an interval target variable, each row defines a knot in a piecewise linear spline function. The consequence of making a decision is computed by interpolating in the corresponding column of the decision matrix. If the predicted target value is outside the range of knots in the decision matrix, the consequence of a decision is computed by linear extrapolation. If the target values are in nondecreasing or nonincreasing order, any interior target value is allowed to appear twice in the data set so you can specify discontinuities. The end points (that is, the minimum and maximum target values in the data set) may not appear more than once. No target value is allowed to appear more than twice. If the target values are not in nondecreasing or nonincreasing order, the target
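The interpolation and extrapolation rule for interval targets can be sketched as follows. This is a simplified Python illustration, not the procedure's implementation: the function `consequence` and the knot values are hypothetical, and the sketch assumes strictly increasing target values, so it does not handle the repeated interior knots that specify discontinuities.

```python
def consequence(knots, predicted):
    """Evaluate one decision column as a piecewise linear function.

    knots: list of (target_value, consequence) pairs with strictly
    increasing target values and at least two entries (an assumption
    of this sketch).
    """
    if len(knots) < 2:
        raise ValueError("need at least two knots")
    # Find the first segment whose right end is at or beyond the prediction.
    # If none is found, the loop ends on the last segment, which is then
    # used for linear extrapolation above the largest knot. Predictions
    # below the smallest knot stop at the first segment and extrapolate
    # with it.
    for (x0, y0), (x1, y1) in zip(knots, knots[1:]):
        if predicted <= x1:
            break
    slope = (y1 - y0) / (x1 - x0)
    return y0 + slope * (predicted - x0)


# Hypothetical decision column with knots at target values 0, 10, and 20.
knots = [(0.0, 0.0), (10.0, 5.0), (20.0, 25.0)]
inside = consequence(knots, 15.0)   # interpolated on the second segment
above = consequence(knots, 30.0)    # extrapolated with the last segment
below = consequence(knots, -10.0)   # extrapolated with the first segment
```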
