The arboretum procedure


    5 NO_BB_HU    -0.097288 5.56801E-6 NO_BB -> HU1
    6 _DUP2_      -0.159583 2.38931E-6 DIVISIONEAST -> HU1
    7 BIAS_HU1     4.100399 1.23564E-6 BIAS -> HU1
    8 _DUP3_       0.114473 -4.1311E-8 CR_HITS -> LOGSALAR
    9 _DUP4_       0.186717 5.68816E-9 NO_HITS -> LOGSALAR
   10 _DUP5_       0.156385 1.26352E-9 NO_OUTS -> LOGSALAR
   11 _DUP6_      -0.042475  4.3267E-8 NO_ERROR -> LOGSALAR
   12 NO_BB_LO     0.151513 2.33439E-9 NO_BB -> LOGSALAR
   13 _DUP7_       0.055178  1.6828E-8 DIVISIONEAST -> LOGSALAR
   14 HU1_LOGS     0.839144 -4.7461E-8 HU1 -> LOGSALAR
   15 BIAS_LOG     5.490961 3.77028E-8 BIAS -> LOGSALAR

        Value of Objective Function = 0.1453574605

List Report of Selected Variables in the Score OUTFIT= Data Set

The PROC PRINT report of the OUTFIT= data set lists selected summary statistics from the scored test data set.

  Partial Listing of the Score OUTFIT= Data Set

                            Test: Mean       Test: Root Mean    Test: Maximum
      _ITER_    _PNAME_     Squared Error    Squared Error      Absolute Error

         0      P_LOGSAL      0.15595           0.39491           1.60237
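
The three statistics in this listing are simple functions of the residuals on the scored data. In particular, the root mean squared error is the square root of the mean squared error (the square root of 0.15595 is about 0.3949). The following Python sketch shows one way to compute them. It assumes the mean squared error is the plain average of the squared residuals with no degrees-of-freedom correction, which may differ from the procedure's exact definition. The function name `fit_statistics` and the sample values are hypothetical and are not part of the SAS output.

```python
import math

def fit_statistics(actual, predicted):
    """Return (mean squared error, root mean squared error, maximum absolute error)."""
    residuals = [a - p for a, p in zip(actual, predicted)]
    # Assumption: plain average of squared residuals (no degrees-of-freedom correction).
    mse = sum(r * r for r in residuals) / len(residuals)
    rmse = math.sqrt(mse)
    max_abs = max(abs(r) for r in residuals)
    return mse, rmse, max_abs

# Hypothetical log-salary values, for illustration only.
actual = [5.0, 6.0, 7.0, 4.5]
predicted = [5.5, 6.0, 6.0, 4.0]
mse, rmse, max_abs = fit_statistics(actual, predicted)
print(mse, rmse, max_abs)
```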

Diagnostic Plots for the Scored Test Baseball Data

Plot of the log of salary versus the predicted log of salary.

Plot of the residual values versus the predicted log of salary.

Copyright 2000 by SAS Institute Inc., Cary, NC, USA. All rights reserved.


The preliminary PROC DMREG run selects the reduced input set.

proc dmreg data=sampsio.dmdbase dmdbcat=sampsio.dmdbase
   testdata=sampsio.dmtbase outest=regest;
   class league division position;
   model logsalar = no_atbat no_hits no_home no_runs no_rbi no_bb
                    yr_major cr_atbat cr_hits cr_home cr_runs
                    cr_rbi cr_bb league division position no_outs
                    no_assts no_error /
                    error=normal selection=stepwise
                    slentry=0.25 slstay=0.25 choose=sbc;
   title1 'Preliminary DMREG Stepwise Selection';
run;



The PROC NEURAL statement invokes the procedure. The DATA= option identifies
the training data set that is used to fit the model. The DMDBCAT= option identifies
the training catalog. The RANDOM= option specifies the seed that is used to
set the random initial weights.

proc neural data=sampsio.dmdbase 




The INPUT statements specify the input layers. There are separate
input layers for the interval and nominal inputs. The LEVEL= option specifies
the measurement level. The ID= option specifies an identifier for each input
layer.


   input cr_hits no_hits no_outs no_error no_bb
         / level=interval id=int;
   input division / level=nominal id=nom;


The HIDDEN statement sets the number of hidden units. The ID= option
specifies an identifier for the hidden layer. By default, the combination
function is set to linear and the activation function is set to hyperbolic
tangent.


    hidden 1 / id=hu;   


The TARGET statement defines an output layer. The output layer computes
predicted values and compares those predicted values with the value of the
target variable (LOGSALAR). The LEVEL= option specifies the target measurement
level. The ID= option specifies an identifier for the output layer. By default,
the combination function is set to linear, the activation function is set
to the identity, and the error function is set to normal for continuous targets.

target logsalar / 


                 id=tar ;


The CONNECT statements specify how to connect the layers. The id-list
specifies the identifiers of two or more layers to connect. In this example,
each input unit is connected to the hidden unit and to the output unit, and
the hidden unit is connected to the output unit.

   connect int tar;
   connect nom tar;
   connect int hu;
   connect nom hu;
   connect hu tar;
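
To make the architecture concrete, the following Python sketch computes one prediction from a network wired this way: every input feeds both the hyperbolic tangent hidden unit and the linear output unit, and the hidden unit feeds the output unit. This is an illustration under stated assumptions, not the procedure's implementation. The weight values are hypothetical, the inputs are assumed to be already prepared as numbers, and representing the nominal DIVISION input by a single numeric value is an assumption that may not match the procedure's actual coding. With five interval inputs and one nominal term, the wiring needs 7 weights into the hidden unit and 8 into the output unit, 15 in total, which agrees with the 15 numbered parameter estimates in the listing.

```python
import math

def predict_logsalar(interval_inputs, division_east, w):
    """Forward pass: inputs -> one tanh hidden unit -> linear output,
    plus direct input-to-output connections."""
    # Hidden unit: linear combination of all inputs plus a bias, then tanh.
    hidden_net = w["bias_hu"] + w["div_hu"] * division_east
    hidden_net += sum(wi * x for wi, x in zip(w["int_hu"], interval_inputs))
    hu = math.tanh(hidden_net)
    # Output unit: linear combination with identity activation.
    out = w["bias_tar"] + w["hu_tar"] * hu + w["div_tar"] * division_east
    out += sum(wi * x for wi, x in zip(w["int_tar"], interval_inputs))
    return out

# Hypothetical weights, for illustration only (not the fitted estimates).
weights = {
    "int_hu": [0.1, -0.2, 0.05, 0.0, 0.3],   # cr_hits, no_hits, no_outs, no_error, no_bb -> hidden
    "div_hu": -0.1,                           # division term -> hidden
    "bias_hu": 0.5,
    "int_tar": [0.1, 0.2, 0.1, -0.05, 0.15],  # interval inputs -> output
    "div_tar": 0.05,                          # division term -> output
    "hu_tar": 0.8,                            # hidden -> output
    "bias_tar": 5.0,
}

# Count the weights implied by the five CONNECT statements plus the two biases.
n_weights = (len(weights["int_hu"]) + 2) + (len(weights["int_tar"]) + 3)

prediction = predict_logsalar([0.0, 0.0, 0.0, 0.0, 0.0], 0.0, weights)
print(n_weights, prediction)
```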


The PRELIM statement does preliminary training using 10 different sets
of initial weights. The weights from the preliminary run with the smallest
objective function among all runs are retained for subsequent training with
the TRAIN statement. Preliminary training may help prevent the network
from converging to a bad local minimum.

   prelim 10;
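
The idea behind preliminary training can be sketched in a few lines of Python: run a short optimization from several random starting points and keep the result with the smallest objective function. This is only an illustration of the multi-start idea, not the procedure's algorithm. The one-dimensional objective, the use of plain gradient descent, the step size, and the seed value are all hypothetical.

```python
import random

def objective(w):
    """Toy non-convex objective with two local minima (hypothetical)."""
    return (w * w - 1.0) ** 2 + 0.3 * w

def gradient(w):
    """Derivative of the toy objective."""
    return 4.0 * w * (w * w - 1.0) + 0.3

def short_run(w, steps=20, rate=0.05):
    """A few gradient-descent steps standing in for one preliminary run."""
    for _ in range(steps):
        w -= rate * gradient(w)
    return w

def preliminary_training(n_starts, seed):
    """Return (objective, weight) pairs, one per random starting point."""
    rng = random.Random(seed)
    candidates = []
    for _ in range(n_starts):
        w0 = rng.uniform(-2.0, 2.0)   # random initial weight
        w = short_run(w0)
        candidates.append((objective(w), w))
    return candidates

candidates = preliminary_training(n_starts=10, seed=12345)
# Keep the run with the smallest objective as the start for full training.
best_objective, best_weight = min(candidates)
print(best_objective, best_weight)
```

Because the two minima of this toy objective have different depths, starts that fall into the shallower one are discarded in favor of a start that reaches the deeper one, which is the benefit the PRELIM statement aims for.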


The TRAIN statement trains the network in order to find the best weights
(parameter estimates) to fit the training data. By default, the Levenberg-Marquardt
optimization technique is used for small least squares networks, such as the
one in this example.
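
For readers unfamiliar with Levenberg-Marquardt, the following Python sketch applies the technique to a tiny least squares problem. It is a textbook-style illustration, not the code that the procedure runs. The two-parameter model y = a * tanh(b * x), the data, the starting values, and the damping schedule (divide or multiply the damping term by 10) are all assumptions chosen for brevity.

```python
import math

def model(params, x):
    a, b = params
    return a * math.tanh(b * x)

def sse(params, xs, ys):
    """Sum of squared errors, the least squares objective."""
    return sum((y - model(params, x)) ** 2 for x, y in zip(xs, ys))

def lm_fit(xs, ys, params, iterations=100, lam=1e-3):
    """Levenberg-Marquardt for the two-parameter model y = a * tanh(b * x)."""
    a, b = params
    current = sse((a, b), xs, ys)
    for _ in range(iterations):
        # Accumulate J'J and J'r, where J holds the model's partial derivatives.
        jaa = jab = jbb = ga = gb = 0.0
        for x, y in zip(xs, ys):
            t = math.tanh(b * x)
            da = t                        # d(model)/da
            db = a * x * (1.0 - t * t)    # d(model)/db
            r = y - a * t                 # residual
            jaa += da * da
            jab += da * db
            jbb += db * db
            ga += da * r
            gb += db * r
        # Solve the damped normal equations (J'J + lam*I) step = J'r by Cramer's rule.
        m11 = jaa + lam
        m22 = jbb + lam
        det = m11 * m22 - jab * jab
        step_a = (ga * m22 - jab * gb) / det
        step_b = (m11 * gb - jab * ga) / det
        trial = sse((a + step_a, b + step_b), xs, ys)
        if trial < current:
            # Accept the step and reduce damping (closer to Gauss-Newton).
            a, b, current = a + step_a, b + step_b, trial
            lam /= 10.0
        else:
            # Reject the step and increase damping (closer to gradient descent).
            lam *= 10.0
    return (a, b), current

# Hypothetical noise-free data generated from a = 2.0, b = 0.5.
xs = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
ys = [2.0 * math.tanh(0.5 * x) for x in xs]
start = (1.0, 1.0)
initial_sse = sse(start, xs, ys)
fitted, final_sse = lm_fit(xs, ys, start)
print(initial_sse, final_sse, fitted)
```

Because a step is accepted only when it lowers the sum of squared errors, the objective never increases from one iteration to the next. The damping term lets the method fall back to small, cautious steps when the Gauss-Newton step would overshoot.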

