The arboretum procedure

Date: 30.04.2018

option. In the COST= option, you may not use abbreviated variable lists such as D1-D3, ABC--XYZ, or PQR:.

Default:

All costs are assumed to be 0.

CAUTION:

The COST= option may be specified only when the DECDATA= data set is of TYPE=REVENUE.

PRIORVAR=variable

Specifies the variable in the DECDATA= data set that contains the prior probabilities to use for making decisions.

Default:

None

The DMREG Procedure

FREQ Statement

Specifies the variable that contains frequencies for training data.

FREQ <variable> ;

variable

Specifies the frequency variable. If specified, the FREQ variable overrides the frequency variable recorded in the DMDB metadata. If the FREQ statement names no variable, then no frequency variable is used.

CAUTION:

If the DMDB already defines a frequency variable, it is not advisable to name a different variable in the FREQ statement, because the training data never contains observations with invalid values in the frequency variable specified in the DMDB. For example, if the frequency variable specified in the DMDB contains a 0 or negative value, then that observation is discarded even if the variable that you specified in the FREQ statement of the DMREG procedure contains a valid frequency value.

Default:

If the FREQ statement is not specified, the frequency variable in the DMDB is used. If the FREQ statement is specified without a variable, a frequency of 1 is used for all observations.

Range:

The frequency variable can contain integer or non-integer values.
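The precedence rules above can be sketched in a few lines of Python. This is an illustration of the documented behavior, not DMREG code: the function name, the sentinel, and the column names are hypothetical, and the fallback to a frequency of 1 when the DMDB defines no frequency variable is an assumption.

```python
NO_STATEMENT = object()  # hypothetical sentinel: no FREQ statement was given


def training_frequencies(rows, dmdb_freq=None, freq_stmt=NO_STATEMENT):
    """Return (row, frequency) pairs for the rows that take part in training.

    dmdb_freq -- name of the frequency variable in the DMDB, or None
    freq_stmt -- NO_STATEMENT (no FREQ statement), None (FREQ statement
                 without a variable), or a variable name
    """
    kept = []
    for row in rows:
        # Rows whose DMDB frequency is 0 or negative are discarded,
        # whatever the FREQ statement says.
        if dmdb_freq is not None and row[dmdb_freq] <= 0:
            continue
        if freq_stmt is NO_STATEMENT:
            # No FREQ statement: use the DMDB frequency variable
            # (assumption: frequency 1 if the DMDB defines none).
            freq = row[dmdb_freq] if dmdb_freq is not None else 1
        elif freq_stmt is None:
            # FREQ statement without a variable: frequency 1 for all rows.
            freq = 1
        else:
            # FREQ variable overrides the DMDB frequency variable.
            freq = row[freq_stmt]
        kept.append((row, freq))
    return kept


rows = [
    {"id": 1, "dmdb_n": 2, "n": 5},
    {"id": 2, "dmdb_n": 0, "n": 3},    # dropped: invalid DMDB frequency
    {"id": 3, "dmdb_n": 1.5, "n": 4},  # non-integer frequencies are allowed
]
print([f for _, f in training_frequencies(rows, "dmdb_n", "n")])   # [5, 4]
print([f for _, f in training_frequencies(rows, "dmdb_n")])        # [2, 1.5]
print([f for _, f in training_frequencies(rows, "dmdb_n", None)])  # [1, 1]
```

The second row illustrates the caution: its value of n is valid, but the row is still lost because its DMDB frequency is 0.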

MODEL Statement

Specifies modeling options.

Requirement: The MODEL statement is required.

MODEL dependent=independent(s) / model-option(s);

Required Argument

dependent=independent(s)

where the arguments are defined as follows:

dependent

Specifies the response variable (target).

independent(s)

Specifies the explanatory variables or effects (inputs). The syntax of effects is described in .

Options

model-option(s)

Specifies options that affect the fit, confidence intervals, variable selection, and specification of the model as follows:

MODEL Options - Fitting Options

MISCCONV=n

Specifies the critical misclassification rate at which to stop iterations.

Default:

n = 0

Range:

0 - 1

STARTMISC=n

Specifies the number of iterations to be processed before checking misclassification rate.

Default:

Depends on the optimization technique:

n = 3 for TECHNIQUE=NEWRAP, NRRIDG, or TRUREG

n = 5 for TECHNIQUE=QUANEW or DBLDOG

n = 10 for TECHNIQUE=CONGRA

Alias:

STATMISC
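As a rough illustration of how MISCCONV= and STARTMISC= interact, the Python sketch below encodes the default table and a stopping check. It is not DMREG code. Two points are assumptions because the text does not state them: that the check starts once n iterations have completed, and that iteration stops when the misclassification rate is at or below the MISCCONV= value. The function name is hypothetical.

```python
# Default STARTMISC= value for each optimization technique.
DEFAULT_STARTMISC = {
    "NEWRAP": 3, "NRRIDG": 3, "TRUREG": 3,
    "QUANEW": 5, "DBLDOG": 5,
    "CONGRA": 10,
}


def should_stop(iteration, misc_rate, technique, miscconv=0.0, startmisc=None):
    """Decide whether the misclassification criterion ends the iterations."""
    if not 0.0 <= miscconv <= 1.0:
        raise ValueError("MISCCONV= must lie in the range 0 - 1")
    if startmisc is None:
        startmisc = DEFAULT_STARTMISC[technique.upper()]
    # Assumption: the rate is not examined before `startmisc` iterations.
    if iteration < startmisc:
        return False
    # Assumption: stop once the rate is at or below the critical value.
    return misc_rate <= miscconv


print(should_stop(2, 0.0, "NEWRAP"))                  # False: too early to check
print(should_stop(3, 0.0, "NEWRAP"))                  # True
print(should_stop(10, 0.04, "CONGRA", miscconv=0.05)) # True
```

With the default MISCCONV= value of 0, this check fires only when every observation is classified correctly.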

MODEL Options - Miscellaneous Options

ALPHA=n

Specifies the significance level of confidence intervals for regression parameters.

Default:

.05

CLPARM

Specifies the computation of confidence intervals for parameters.
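To make the role of ALPHA= concrete, the Python sketch below computes a normal-approximation (Wald) interval for one parameter estimate. The text does not say which interval DMREG computes, so the Wald form is an assumption, and the function name and the numbers are hypothetical.

```python
from statistics import NormalDist


def wald_interval(estimate, std_error, alpha=0.05):
    """Two-sided 100 * (1 - alpha) percent interval: estimate +/- z * SE."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must be strictly between 0 and 1")
    # Standard normal quantile that leaves alpha / 2 in each tail.
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return estimate - z * std_error, estimate + z * std_error


# ALPHA=.05 (the default) gives a 95% interval; ALPHA=.10 gives a 90% interval.
low95, high95 = wald_interval(1.0, 0.5)
low90, high90 = wald_interval(1.0, 0.5, alpha=0.10)
print(round(low95, 3), round(high95, 3))
print(round(low90, 3), round(high90, 3))
```

A smaller ALPHA= value widens the interval, because less probability is left in the tails.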

CORRB

Specifies that the correlation matrix is to be printed.

COVB

Specifies that the covariance matrix is to be printed.
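The two matrices carry the same information at different scales. Assuming, as in other SAS regression procedures, that both refer to the parameter estimates, the Python sketch below derives a correlation matrix from a covariance matrix. The function name and the example values are hypothetical.

```python
import math


def covb_to_corrb(cov):
    """Scale a covariance matrix to a correlation matrix.

    corr[i][j] = cov[i][j] / (sd[i] * sd[j]), where sd[i] = sqrt(cov[i][i]).
    """
    size = len(cov)
    sd = [math.sqrt(cov[i][i]) for i in range(size)]
    return [[cov[i][j] / (sd[i] * sd[j]) for j in range(size)]
            for i in range(size)]


covb = [[4.0, 1.0],
        [1.0, 1.0]]
corrb = covb_to_corrb(covb)
print(corrb)  # [[1.0, 0.5], [0.5, 1.0]]
```

The square roots of the diagonal of the covariance matrix are the standard errors of the estimates.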

CHOOSE=AIC | NONE | SBC | TDECDATA | VDECDATA | VERROR | VMISC | XDECDATA | XERROR | XMISC

Specifies the criterion for the selection of the model.

AIC

Represents the Akaike Information Criterion. The model with the smallest criterion value is chosen.

NONE

Chooses standard variable selection based on the entry and/or stay P-values.

SBC

Represents the Schwarz Bayesian Criterion. The model with the smallest criterion value is chosen.

TDECDATA

Represents the total profit/loss for the training data. The model with the largest profit or the smallest loss is chosen.

VDECDATA

Represents the total profit/loss for the VALIDATA= data set. The model with the largest profit or the smallest loss is chosen.

VERROR

Represents the error rate for the VALIDATA= data set. The error is the sum of squared errors for least-squares regression and the negative log-likelihood for logistic regression. The model with the smallest error rate is chosen.

VMISC

Represents the misclassification rate for the VALIDATA= data set. The model with the smallest misclassification rate is chosen.

XDECDATA

Represents the total profit/loss for cross-validation of the training data. The model with the largest profit or the smallest loss is chosen.

XERROR

Represents the error rate for cross-validation. The error is the sum of squared errors for least-squares regression and the negative log-likelihood for logistic regression. The model with the smallest error rate is chosen.

XMISC

Represents the misclassification rate for cross-validation. The model with the smallest misclassification rate is chosen.

Default:

If decision processing is specified, the default is CHOOSE=TDECDATA; if the VALIDATA= data set is also specified, the default is CHOOSE=VDECDATA.
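The selection rule behind the CHOOSE= criteria can be sketched in Python as follows. This is an illustration, not DMREG code. The AIC and SBC formulas are the standard textbook definitions, which the text above does not spell out, so treating them as what DMREG computes is an assumption. The function names, the step labels, and the numbers are hypothetical.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: -2 log L + 2k."""
    return -2.0 * log_likelihood + 2.0 * n_params


def sbc(log_likelihood, n_params, n_obs):
    """Schwarz Bayesian Criterion: -2 log L + k ln(n)."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


DECISION_CRITERIA = {"TDECDATA", "VDECDATA", "XDECDATA"}


def choose_step(values, criterion, decisions_are_profit=True):
    """Pick the selection step whose criterion value is best.

    values maps a step label to its criterion value. The decision criteria
    take the largest value when the decision data holds profits and the
    smallest when it holds losses; every other criterion is minimized.
    """
    if criterion.upper() in DECISION_CRITERIA and decisions_are_profit:
        return max(values, key=values.get)
    return min(values, key=values.get)


# Hypothetical selection path: (log-likelihood, number of parameters).
steps = {"step1": (-120.0, 2), "step2": (-112.0, 4), "step3": (-108.0, 7)}
n_obs = 200

by_aic = choose_step({s: aic(ll, k) for s, (ll, k) in steps.items()}, "AIC")
by_sbc = choose_step({s: sbc(ll, k, n_obs) for s, (ll, k) in steps.items()}, "SBC")
print(by_aic, by_sbc)  # step3 step2
```

In this example SBC picks a smaller model than AIC because its penalty per parameter, ln(200), is larger than 2.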

DETAILS

Prints details at each model selection step.

HIERARCHY=ALL | CLASS

Specifies how containment is to be applied.

ALL

Specifies that all independent variables that meet hierarchical requirements are included in the model.

CLASS

Specifies that only CLASS variables that meet hierarchical requirements are included in the model.

Default:

ALL
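Containment can be pictured with a small Python sketch. It assumes the usual meaning of model hierarchy: an effect such as A*B may enter only when every effect it contains (here A and B) is already in the model. The reading of HIERARCHY=CLASS used below, where effects built only from non-CLASS variables are exempt from the check, is an assumption; the text above does not define it precisely. All names are hypothetical, and this is not DMREG code.

```python
from itertools import combinations


def contained_effects(effect):
    """All lower-order effects contained in `effect` (a set of variable names)."""
    names = sorted(effect)
    return [frozenset(combo)
            for order in range(1, len(names))
            for combo in combinations(names, order)]


def may_enter(effect, model, hierarchy="ALL", class_vars=frozenset()):
    """Check whether `effect` satisfies the hierarchy requirement."""
    effect = frozenset(effect)
    # Assumption: under CLASS, effects with no CLASS variable are exempt.
    if hierarchy.upper() == "CLASS" and not (effect & frozenset(class_vars)):
        return True
    return all(sub in model for sub in contained_effects(effect))


model = {frozenset({"A"}), frozenset({"X"})}  # main effects already entered
class_vars = {"A", "B"}                       # A and B are CLASS variables

print(may_enter({"A", "B"}, model))                       # False: B is missing
print(may_enter({"A", "X"}, model))                       # True
print(may_enter({"X", "Z"}, model))                       # False under ALL
print(may_enter({"X", "Z"}, model, "CLASS", class_vars))  # True: no CLASS variable
```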

INCLUDE=n

Specifies that the first n effects in the model are to be included in each model.

Default:
