The arboretum procedure

The DMREG Procedure

Details

Input

The input to the DMREG procedure can be assigned one of these roles:

Training

The DATA= data set is used to fit the initial model.

Validation

The VALIDATA= data set is used to compute assessment statistics and to fine-tune the model during stepwise selection.

Test

The TESTDATA= data set is an additional "hold out" data set that you can use to compute assessment statistics.

Score

The DATA= data set in the SCORE statement is used for predicting target values for a new data set that may not contain the target.

Specification of Effects

Different types of effects can be used in the DMREG procedure. In the following list, assume that A, B, and C are class variables and that X1, X2, and Y are continuous variables:

Regressor effects are specified by writing continuous variables individually:

X1 X2

Polynomial effects are specified by joining two or more continuous variables with asterisks:

X1*X1 X1*X2

Main effects are specified by writing class variables individually:

A B C

Crossed effects (interactions) are specified by joining class variables with asterisks:

A*B B*C A*B*C

Continuous-by-class effects are specified by joining continuous variables and class variables with asterisks:

X1*A

Note: Nested effects are not supported.
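The effect types listed above can be told apart mechanically from the variables that appear in each term. The following Python sketch is only an illustration of that taxonomy, not the way DMREG parses a MODEL statement; the function name `classify_effect` and its arguments are hypothetical.

```python
def classify_effect(effect, class_vars, continuous_vars):
    """Name the effect type of a term such as 'X1*A' (illustrative helper only)."""
    factors = effect.split("*")
    unknown = [f for f in factors
               if f not in class_vars and f not in continuous_vars]
    if unknown:
        raise ValueError(f"unknown variable(s) in effect {effect!r}: {unknown}")

    n_class = sum(f in class_vars for f in factors)
    n_cont = len(factors) - n_class

    if len(factors) == 1:
        # A single variable: main effect if class, regressor if continuous.
        return "main" if n_class else "regressor"
    if n_class == 0:
        return "polynomial"           # only continuous variables joined by '*'
    if n_cont == 0:
        return "crossed"              # only class variables joined by '*'
    return "continuous-by-class"      # a mix of both kinds


# Same variable roles as in the list above.
class_vars = {"A", "B", "C"}
continuous_vars = {"X1", "X2", "Y"}

for term in ["X1", "X1*X1", "X1*X2", "A", "A*B", "A*B*C", "X1*A"]:
    print(term, "->", classify_effect(term, class_vars, continuous_vars))
```

Because nested effects are not supported, the sketch recognizes only the asterisk as a joining operator.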

Optimization Methods

The following table lists the general nonlinear optimization methods and the default maximum number of iterations and function calls for each method.

Optimization Methods for the Regression Node

Optimization Method               Maximum Iterations   Maximum Function Calls
Conjugate Gradient                400                  1000
Double Dogleg                     200                  500
Newton-Raphson with Line Search   50                   125
Newton-Raphson with Ridging       50                   125
Quasi-Newton                      200                  500
Trust-Region                      50                   125

You should set the optimization method based on the size of the data mining problem, as follows:

Small-to-medium problems - The Trust-Region, Newton-Raphson with Ridging, and Newton-Raphson with Line Search methods are appropriate for small and medium-sized optimization problems (up to 40 model parameters) where the Hessian matrix is easy and cheap to compute. Newton-Raphson with Ridging can sometimes be faster than Trust-Region, but Trust-Region is numerically more stable. If the Hessian matrix is not singular at the optimum, then Newton-Raphson with Line Search can be a very competitive method.

Medium problems - The Quasi-Newton and Double Dogleg methods are appropriate for medium-sized optimization problems (up to 400 model parameters) where the objective function and the gradient are much faster to compute than the Hessian. Quasi-Newton and Double Dogleg require more iterations than the Trust-Region or Newton-Raphson methods, but each iteration is much faster.

Large problems - The Conjugate Gradient method is appropriate for large data mining problems (more than 400 model parameters) where the objective function and the gradient are much faster to compute than the Hessian matrix, and where storing the approximate Hessian matrix would require too much memory.

Note: To learn about these optimization methods, see the SAS/OR Technical Report: The NLP Procedure (1997).

The method that the "Default" optimization setting selects depends on the number of parameters in the model. If the number of parameters is less than or equal to 40, then the default method is Newton-Raphson with Ridging. If the number of parameters is greater than 40 and less than 400, then the default method is Quasi-Newton. If the number of parameters is greater than 400, then the default method is Conjugate Gradient.
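The selection rule for the default method can be written as a short threshold check. The Python sketch below is an illustration only, not DMREG code, and the function name `default_optimization_method` is hypothetical. The text does not say which method applies when the model has exactly 400 parameters; the sketch assumes Quasi-Newton in that case, in line with the "up to 400" guideline for medium problems.

```python
def default_optimization_method(n_params):
    """Return the method picked by the "Default" setting (illustrative sketch)."""
    if n_params < 0:
        raise ValueError("number of parameters cannot be negative")
    if n_params <= 40:
        return "Newton-Raphson with Ridging"
    if n_params <= 400:
        # Assumption: the source leaves exactly 400 parameters unspecified;
        # it is treated here as a medium-sized problem.
        return "Quasi-Newton"
    return "Conjugate Gradient"


for n in (12, 40, 41, 250, 1500):
    print(n, "->", default_optimization_method(n))
```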

Fit Statistics for OUTEST and OUTFIT Data Sets

The OUTEST= data set in the PROC DMREG statement contains fit statistics for the training, test, and/or validation data. Depending on the ROLE= option in the SCORE statement, the OUTFIT= data set contains fit statistics for either the training, test, or validation data.

Fit Statistics for the Training Data

Fit Statistic   Training Data
_AIC_           Train: Akaike's Information Criterion
_ASE_           Train: Average Squared Error
_AVERR_         Train: Average Error Function
_DFE_           Train: Degrees of Freedom for Error
_DFM_           Train: Model Degrees of Freedom
_DFT_           Train: Total Degrees of Freedom
_DIV_           Train: Divisor for ASE
_ERR_           Train: Error Function
_FPE_           Train: Final Prediction Error
_MAX_           Train: Maximum Absolute Error
_MSE_           Train: Mean Square Error
_NOBS_          Train: Sum of Frequencies
_NW_            Train: Number of Estimate Weights
_RASE_          Train: Root Average Sum of Squares
_RFPE_          Train: Root Final Prediction Error
_RMSE_          Train: Root Mean Squared Error
_SBC_           Train: Schwarz's Bayesian Criterion
_SSE_           Train: Sum of Squared Errors
_SUMW_          Train: Sum of Case Weights Times Frequency
_MISC_          Train: Misclassification Rate
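Several of the statistics in the table are simple functions of others. The Python sketch below shows those relationships under stated assumptions: that _ASE_ is _SSE_ divided by _DIV_ (as the label "Divisor for ASE" suggests), that _DFE_ equals _DFT_ minus _DFM_, that _MSE_ is _SSE_ divided by _DFE_, and that the "Root" statistics are square roots of their counterparts. These are the usual textbook definitions rather than formulas stated in this section, and the function name `derived_fit_statistics` is hypothetical.

```python
import math


def derived_fit_statistics(sse, div, dft, dfm):
    """Derive error-based fit statistics from SSE and the degrees of freedom.

    All formulas here are assumptions (standard definitions), not taken
    from the DMREG documentation itself.
    """
    dfe = dft - dfm                      # assumed: error df = total df - model df
    if div <= 0 or dfe <= 0:
        raise ValueError("div and dft - dfm must both be positive")
    ase = sse / div                      # average squared error
    mse = sse / dfe                      # mean square error
    return {
        "_DFE_": dfe,
        "_ASE_": ase,
        "_RASE_": math.sqrt(ase),
        "_MSE_": mse,
        "_RMSE_": math.sqrt(mse),
    }


# Made-up numbers, used only to exercise the function.
stats = derived_fit_statistics(sse=50.0, div=100, dft=100, dfm=5)
for name, value in stats.items():
    print(name, round(value, 4))
```

Under these definitions _MSE_ is never smaller than _ASE_ when _DIV_ equals _DFT_, because its divisor is reduced by the model degrees of freedom.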

Fit Statistics for the Test Data

Fit Statistic   Test Data
_TASE_          Test: Average Squared Error
