
The DMREG Procedure

Details

Input

The input to the DMREG procedure can be assigned one of these roles:

Training
    The DATA= data set is used to fit the initial model.

Validation
    The VALIDATA= data set is used to compute assessment statistics and to fine-tune the model during stepwise selection.

Test
    The TESTDATA= data set is an additional hold-out data set that you can use to compute assessment statistics.

Score
    The DATA= data set in the SCORE statement is used to predict target values for a new data set that may not contain the target.
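All four roles can appear in a single invocation. The following sketch is illustrative only: the data set names (train, valid, test, holdout), the catalog name cat_train, and the variables Y, X1, and A are hypothetical, and the DMDBCAT= option shown names the DMDB metadata catalog that the procedure reads alongside the input data.

```sas
proc dmreg data=train dmdbcat=cat_train  /* Training role: fits the model          */
           validata=valid                /* Validation role: tunes stepwise steps  */
           testdata=test;                /* Test role: hold-out assessment         */
   class a;
   model y = x1 a;
   score data=holdout;                   /* Score role: predict for new data       */
run;
```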

Specification of Effects

Different types of effects can be used in the DMREG procedure. In the following list, assume that A, B, and C are class variables and that X1, X2, and Y are continuous variables:

1. Regressor effects are specified by writing continuous variables individually:
       X1 X2

2. Polynomial effects are specified by joining two or more continuous variables with asterisks:
       X1*X1 X1*X2

3. Main effects are specified by writing class variables individually:
       A C

4. Crossed effects (interactions) are specified by joining class variables with asterisks:
       A*B B*C A*B*C

5. Continuous-by-class effects are specified by joining continuous variables and class variables with asterisks:
       X1*A

Note:   Nested effects are not supported.
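The effect types above can be mixed freely on the right-hand side of one MODEL statement. A hypothetical sketch (data set, catalog, and variable names are illustrative):

```sas
proc dmreg data=train dmdbcat=cat_train;
   class a b c;                             /* declare the class variables  */
   /* regressor, polynomial, main, crossed, and continuous-by-class effects */
   model y = x1 x2  x1*x1 x1*x2  a c  a*b  x1*a;
run;
```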


Optimization Methods

The following table lists the general nonlinear optimization methods and the default maximum number of iterations and function calls for each method.

Optimization Methods for the Regression Node

    Optimization Method               Maximum Iterations   Maximum Function Calls
    Conjugate Gradient                400                  1000
    Double Dogleg                     200                  500
    Newton-Raphson with Line Search   50                   125
    Newton-Raphson with Ridging       50                   125
    Quasi-Newton                      200                  500
    Trust-Region                      50                   125


You should choose the optimization method based on the size of the data mining problem, as follows:

1. Small-to-medium problems - The Trust-Region, Newton-Raphson with Ridging, and Newton-Raphson with Line Search methods are appropriate for small and medium-sized optimization problems (up to 40 model parameters) where the Hessian matrix is easy and cheap to compute. Newton-Raphson with Ridging can sometimes be faster than Trust-Region, but Trust-Region is numerically more stable. If the Hessian matrix is not singular at the optimum, then Newton-Raphson with Line Search can be a very competitive method.

2. Medium problems - The Quasi-Newton and Double Dogleg methods are appropriate for medium optimization problems (up to 400 model parameters) where the objective function and the gradient are much faster to compute than the Hessian. Quasi-Newton and Double Dogleg require more iterations than the Trust-Region or Newton-Raphson methods, but each iteration is much faster.

3. Large problems - The Conjugate Gradient method is appropriate for large data mining problems (more than 400 model parameters) where the objective function and the gradient are much faster to compute than the Hessian matrix, and where storing the approximate Hessian matrix would require too much memory.

Note:   To learn about these optimization methods, see the SAS/OR Technical Report: The NLP Procedure (1997).

Note:   To learn about these optimization methods, see the SAS/OR Technical Report: The NLP

Procedure (1997).   

The default optimization method depends on the number of parameters in the model. If the number of parameters is less than or equal to 40, the default method is Newton-Raphson with Ridging. If the number of parameters is greater than 40 and at most 400, the default method is Quasi-Newton. If the number of parameters is greater than 400, the default method is Conjugate Gradient.
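If the default does not suit the problem, the method and its limits can usually be overridden. The sketch below assumes that your release supports an NLOPTIONS statement with TECHNIQUE=, MAXITER=, and MAXFUNC= options, as in other SAS nonlinear optimization procedures; verify this against your installation's documentation before relying on it.

```sas
proc dmreg data=train dmdbcat=cat_train;
   class a;
   model y = x1 x2 a;
   /* force Conjugate Gradient with its default limits (assumed syntax) */
   nloptions technique=congra maxiter=400 maxfunc=1000;
run;
```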



Fit Statistics for OUTEST and OUTFIT Data Sets

The OUTEST= data set in the PROC DMREG statement contains fit statistics for the training, test,

and/or validation data. Depending on the ROLE= option in the SCORE statement, the OUTFIT= data set

contains fit statistics for either the training, test, or validation data.
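A sketch of requesting both output data sets while scoring a hold-out set. The OUTEST=, OUTFIT=, DATA=, and ROLE= options come from this section; the OUT= option naming the scored output data set, and all data set and variable names, are assumptions.

```sas
proc dmreg data=train dmdbcat=cat_train
           outest=est                /* fit statistics for train/test/valid data */
           outfit=fit;               /* fit statistics per the SCORE ROLE= option */
   class a;
   model y = x1 a;
   score data=holdout out=scored role=test;   /* OUT= is assumed */
run;
```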



Fit Statistics for the Training Data

    Fit Statistic   Training Data
    _AIC_           Train: Akaike's Information Criterion
    _ASE_           Train: Average Squared Error
    _AVERR_         Train: Average Error Function
    _DFE_           Train: Degrees of Freedom for Error
    _DFM_           Train: Model Degrees of Freedom
    _DFT_           Train: Total Degrees of Freedom
    _DIV_           Train: Divisor for ASE
    _ERR_           Train: Error Function
    _FPE_           Train: Final Prediction Error
    _MAX_           Train: Maximum Absolute Error
    _MSE_           Train: Mean Square Error
    _NOBS_          Train: Sum of Frequencies
    _NW_            Train: Number of Estimate Weights
    _RASE_          Train: Root Average Sum of Squares
    _RFPE_          Train: Root Final Prediction Error
    _RMSE_          Train: Root Mean Squared Error
    _SBC_           Train: Schwarz's Bayesian Criterion
    _SSE_           Train: Sum of Squared Errors
    _SUMW_          Train: Sum of Case Weights Times Frequency
    _MISC_          Train: Misclassification Rate


Fit Statistics for the Test Data

    Fit Statistic   Test Data
    _TASE_          Test: Average Squared Error
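Several of these statistics follow standard definitions. The formulas below use the usual textbook forms, with SSE the sum of squared errors (_SSE_), DIV the ASE divisor (_DIV_), DFE the error degrees of freedom (_DFE_), p the model degrees of freedom (_DFM_), n the sum of frequencies (_NOBS_), and L the likelihood. Exact definitions in a given release may differ, so treat these as orientation only, not as this procedure's documented formulas:

\[
\begin{aligned}
\mathrm{ASE} &= \frac{\mathrm{SSE}}{\mathrm{DIV}}, \qquad
\mathrm{RASE} = \sqrt{\mathrm{ASE}}, \qquad
\mathrm{MSE} = \frac{\mathrm{SSE}}{\mathrm{DFE}}, \qquad
\mathrm{RMSE} = \sqrt{\mathrm{MSE}},\\[4pt]
\mathrm{AIC} &= -2\log L + 2p, \qquad
\mathrm{SBC} = -2\log L + p\,\log n .
\end{aligned}
\]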





