# The DMREG Procedure


## Input

The input to the DMREG procedure can be assigned one of these roles:

- Training: the DATA= data set is used to fit the initial model.
- Validation: the VALIDATA= data set is used to compute assessment statistics and to fine-tune the model during stepwise selection.
- Test: the TESTDATA= data set is an additional "hold out" data set that you can use to compute assessment statistics.
- Score: the DATA= data set in the SCORE statement is used to predict target values for a new data set that may not contain the target.

## Specification of Effects

Several types of effects can be used in the DMREG procedure. In the following list, assume that A, B, and C are class variables and that X1, X2, and Y are continuous variables (a short sketch after the list shows these effect types in a MODEL statement):

1. Regressor effects are specified by writing continuous variables individually: X1 X2
2. Polynomial effects are specified by joining two or more continuous variables with asterisks: X1*X1 X1*X2
3. Main effects are specified by writing class variables individually: A C
4. Crossed effects (interactions) are specified by joining class variables with asterisks: A*B B*C A*B*C
5. Continuous-by-class effects are specified by joining continuous variables and class variables with asterisks: X1*A

Note: Nested effects are not supported.
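The following is a minimal sketch of how these effect types combine in a single MODEL statement. The data set name TRAIN, the catalog name CAT, and the DMDBCAT= requirement are illustrative assumptions, not taken from this section; only the effect syntax itself comes from the list above.

```sas
/* Minimal sketch (not a verbatim example from this documentation):
   TRAIN and CAT are hypothetical names; DMREG, like other Enterprise
   Miner procedures, is assumed to read a DMDB catalog via DMDBCAT=.
   A and B are class variables; X1, X2, and Y are continuous. */
proc dmreg data=train dmdbcat=cat;
   class a b;
   model y = x1 x2       /* regressor effects            */
             x1*x1 x1*x2 /* polynomial effects           */
             a b         /* main effects                 */
             a*b         /* crossed effect (interaction) */
             x1*a;       /* continuous-by-class effect   */
run;
```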
## Optimization Methods

The following table lists the general nonlinear optimization methods and the default maximum number of iterations and function calls for each method.

Optimization Methods for the Regression Node

| Optimization Method             | Maximum Iterations | Maximum Function Calls |
|---------------------------------|--------------------|------------------------|
| Conjugate Gradient              | 400                | 1000                   |
| Double Dogleg                   | 200                | 500                    |
| Newton-Raphson with Line Search | 50                 | 125                    |
| Newton-Raphson with Ridging     | 50                 | 125                    |
| Quasi-Newton                    | 200                | 500                    |
| Trust-Region                    | 50                 | 125                    |

You should set the optimization method based on the size of the data mining problem, as follows:

1. Small-to-medium problems: the Trust-Region, Newton-Raphson with Ridging, and Newton-Raphson with Line Search methods are appropriate for small and medium-sized optimization problems (up to 40 model parameters) where the Hessian matrix is easy and cheap to compute. Newton-Raphson with Ridging can sometimes be faster than Trust-Region, but Trust-Region is numerically more stable. If the Hessian matrix is not singular at the optimum, Newton-Raphson with Line Search can be a very competitive method.
2. Medium problems: the quasi-Newton and Double Dogleg methods are appropriate for medium optimization problems (up to 400 model parameters) where the objective function and the gradient are much faster to compute than the Hessian. Quasi-Newton and Double Dogleg require more iterations than the Trust-Region or Newton-Raphson methods, but each iteration is much faster.
3. Large problems: the Conjugate Gradient method is appropriate for large data mining problems (more than 400 model parameters) where the objective function and the gradient are much faster to compute than the Hessian matrix, and where even an approximate Hessian matrix would need too much memory to store.

Note: To learn about these optimization methods, see the SAS/OR Technical Report: The NLP Procedure (1997).

The method used for the "Default" optimization setting depends on the number of parameters in the model. If the number of parameters is less than or equal to 40, the default method is Newton-Raphson with Ridging. If the number of parameters is greater than 40 and less than 400, the default method is quasi-Newton. If the number of parameters is greater than 400, Conjugate Gradient is the default method.

## Fit Statistics for the OUTEST and OUTFIT Data Sets

The OUTEST= data set in the PROC DMREG statement contains fit statistics for the training, test, and/or validation data. Depending on the ROLE= option in the SCORE statement, the OUTFIT= data set contains fit statistics for either the training, test, or validation data.

Fit Statistics for the Training Data

| Fit Statistic | Label                                      |
|---------------|--------------------------------------------|
| _AIC_         | Train: Akaike's Information Criterion      |
| _ASE_         | Train: Average Squared Error               |
| _AVERR_       | Train: Average Error Function              |
| _DFE_         | Train: Degrees of Freedom for Error        |
| _DFM_         | Train: Model Degrees of Freedom            |
| _DFT_         | Train: Total Degrees of Freedom            |
| _DIV_         | Train: Divisor for ASE                     |
| _ERR_         | Train: Error Function                      |
| _FPE_         | Train: Final Prediction Error              |
| _MAX_         | Train: Maximum Absolute Error              |
| _MSE_         | Train: Mean Squared Error                  |
| _NOBS_        | Train: Sum of Frequencies                  |
| _NW_          | Train: Number of Estimated Weights         |
| _RASE_        | Train: Root Average Squared Error          |
| _RFPE_        | Train: Root Final Prediction Error         |
| _RMSE_        | Train: Root Mean Squared Error             |
| _SBC_         | Train: Schwarz's Bayesian Criterion        |
| _SSE_         | Train: Sum of Squared Errors               |
| _SUMW_        | Train: Sum of Case Weights Times Frequency |
| _MISC_        | Train: Misclassification Rate              |

Fit Statistics for the Test Data

| Fit Statistic | Label                       |
|---------------|-----------------------------|
| _TASE_        | Test: Average Squared Error |
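As a hedged illustration of how these pieces fit together, consider the sketch below. All data set names are placeholders, and placing OUTFIT= on the PROC statement is an assumption to be checked against your release's syntax; per the section above, the ROLE= option of the SCORE statement determines whether OUTFIT= describes the training, test, or validation data.

```sas
/* Minimal sketch with placeholder data sets (TRAIN, VALID, TEST,
   NEWCASES) and a hypothetical DMDB catalog CAT. OUTEST= and OUTFIT=
   collect the fit statistics listed above; OUTFIT= placement on the
   PROC statement is an assumption. */
proc dmreg data=train dmdbcat=cat validata=valid testdata=test
           outest=est outfit=fit;
   class a b;
   model y = x1 x2 a b;
   /* Predict target values for new data that need not contain Y. */
   score data=newcases out=scored;
run;
```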
