(Output variable labels from the preceding table: "Previous Modified Gradient", "Previous Change", "Alpha", "Epsilon", "Current Change", and "Current Weight".)
References
Fahlman, S.E. (1989), "Faster-Learning Variations on Back-Propagation: An Empirical Study", in
Touretzky, D., Hinton, G., and Sejnowski, T., eds., Proceedings of the 1988 Connectionist
Models Summer School, Morgan Kaufmann, 38-51.
Riedmiller, M. and Braun, H. (1993), "A Direct Adaptive Method for Faster Backpropagation
Learning: The RPROP Algorithm", Proceedings of the IEEE International Conference on
Neural Networks 1993, San Francisco: IEEE.
Schiffmann, W., Joost, M., and Werner, R. (1994), "Optimization of the Backpropagation
Algorithm for Training Multilayer Perceptrons",
ftp://archive.cis.ohio-state.edu/pub/neuroprose/schiff.bp_speedup.ps.Z.
Copyright 2000 by SAS Institute Inc., Cary, NC, USA. All rights reserved.
The NEURAL Procedure
Examples
The following examples were executed using the HP-UX version 10.20 operating system and the SAS
software release 6.12TS045.
Example 1: Developing a Simple Multilayer Perceptron (Rings Data)
Example 2: Developing a Neural Network for a Continuous Target
Example 3: Neural Network Hill-and-Plateau Example (Surf Data)
Example 1: Developing a Simple Multilayer Perceptron (Rings Data)
Features
- Specifying Input, Hidden, and Output Layers
- Scoring a Test Data Set
- Outputting Fit Statistics
- Creating a Classification Table
- Creating Contour Plots of the Posterior Probabilities
This example demonstrates how to develop a multilayer perceptron network with three hidden units. The example training data set is named
SAMPSIO.DMDRING (rings data). It contains a categorical target (C = 0, 1, or 2) plus two continuous inputs (X and Y). There are 180 cases in
the data set. The SAMPSIO.DMSRING data set is scored using the scoring formula from the trained model.
Both data sets and the SAMPSIO.DMDRING catalog are stored in the sample library.
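Conceptually, the network built below maps the two inputs (X, Y) through three tanh hidden units to posterior probabilities for the three target levels. The following Python sketch illustrates that forward pass; the weight values are made up for illustration (PROC NEURAL estimates them during training), and the reference-level softmax parameterization shown here is one way to account for the 17 parameter estimates reported in the output.

```python
import math

# Hypothetical weights for illustration only; PROC NEURAL estimates these
# during training. The architecture matches the example: 2 interval inputs
# (X, Y), 3 tanh hidden units, and a nominal 3-level target (C).
W_hidden = [[-0.4, -2.2], [2.6, 0.7], [2.6, -1.9]]  # 3 x 2 input->hidden weights
b_hidden = [2.2, 2.3, -2.2]                         # 3 hidden-unit biases
# With one target level as the reference, only 2 logits are modeled:
# 3 x 2 hidden->output weights plus 2 output biases.
W_out = [[5.7, 5.8, 0.1], [0.2, -3.1, 4.4]]         # 2 x 3 hidden->output weights
b_out = [0.5, -0.3]

def forward(x, y):
    """Forward pass: tanh hidden layer, softmax posteriors over 3 classes."""
    h = [math.tanh(w[0] * x + w[1] * y + b) for w, b in zip(W_hidden, b_hidden)]
    logits = [sum(wi * hi for wi, hi in zip(w, h)) + b
              for w, b in zip(W_out, b_out)]
    logits.append(0.0)  # the reference level has an implicit logit of 0
    z = [math.exp(v) for v in logits]
    s = sum(z)
    return [v / s for v in z]  # posterior probabilities for C = 0, 1, 2

# Parameter count under this parameterization:
# 2*3 input weights + 3 hidden biases + 3*2 output weights + 2 output biases
n_params = 2 * 3 + 3 + 3 * 2 + 2  # = 17
```

The posteriors returned by `forward` play the same role as the P_C1, P_C2, and P_C3 variables in the scored data set below.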
Program
proc gplot data=sampsio.dmdring;
   plot y*x=c / haxis=axis1 vaxis=axis2;
   symbol c=black i=none v=dot;
   symbol2 c=red i=none v=square;
   symbol3 c=green i=none v=triangle;
   axis1 c=black width=2.5 order=(0 to 30 by 5);
   axis2 c=black width=2.5 minor=none order=(0 to 20 by 2);
   title 'Plot of the Rings Training Data';
run;
proc neural data=sampsio.dmdring
            dmdbcat=sampsio.dmdring
            random=789;
   input x y / level=interval id=i;        /* two interval-scaled inputs */
   target c / id=o level=nominal;          /* nominal target with 3 levels */
   hidden 3 / id=h;                        /* one hidden layer, 3 units */
   prelim 5;                               /* 5 preliminary training runs */
   train;                                  /* final training */
   score out=out outfit=fit;               /* score the training data */
   score data=sampsio.dmsring out=gridout; /* score the test grid */
   title 'MLP with 3 Hidden Units';
run;
proc print data=fit noobs label;
   var _aic_ _ase_ _max_ _rfpe_ _misc_ _wrong_;
   where _name_ = 'OVERALL';
   title2 'Fit Statistics for the Training Data Set';
run;
proc freq data=out;
   tables f_c*i_c;
   title2 'Misclassification Table';
run;
proc gplot data=out;
   plot y*x=i_c / haxis=axis1 vaxis=axis2;
   symbol c=black i=none v=dot;
   symbol2 c=black i=none v=square;
   symbol3 c=black i=none v=triangle;
   axis1 c=black width=2.5 order=(0 to 30 by 5);
   axis2 c=black width=2.5 minor=none order=(0 to 20 by 2);
   title2 'Classification Results';
run;
proc gcontour data=gridout;
   plot y*x=p_c1 / pattern ctext=black coutline=gray;
   plot y*x=p_c2 / pattern ctext=black coutline=gray;
   plot y*x=p_c3 / pattern ctext=black coutline=gray;
   pattern v=msolid;
   legend frame;
   title2 'Posterior Probabilities';
run;
proc gcontour data=gridout;
   plot y*x=h1 / pattern ctext=black coutline=gray;
   plot y*x=h2 / pattern ctext=black coutline=gray;
   plot y*x=h3 / pattern ctext=black coutline=gray;
   pattern v=msolid;
   legend frame;
   title2 'Hidden Unit Values';
run;
Output
PROC GPLOT of the Rings Training Data
Notice that the target classes are not linearly separable.
PROC NEURAL: Preliminary Training Output
This section lists the objective function for each preliminary iteration. The parameter estimates (weights) from the iteration number that has the
smallest objective function are passed as input for final training. Because the target is nominal, the error function is set to multiple Bernoulli.
Therefore, the objective function that is being minimized is the negative log-likelihood. Iteration number 4 has the smallest objective function.
The 17 initial parameter estimates are also listed in this section of the output.
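The objective being compared across the preliminary runs can be sketched as follows: with a multiple Bernoulli error function, the negative log-likelihood is the (averaged) negative log of the posterior probability assigned to each case's observed class. The posteriors and target indices below are made up for illustration.

```python
import math

# Posterior probabilities over the 3 target levels for a few cases,
# and the index of the observed class for each case. Values are made up;
# in the example these come from scoring the network on the training data.
posteriors = [[0.8, 0.1, 0.1],
              [0.2, 0.7, 0.1],
              [0.3, 0.3, 0.4]]
targets = [0, 1, 2]

# Average negative log-likelihood: the quantity minimized during training.
nll = -sum(math.log(p[t]) for p, t in zip(posteriors, targets)) / len(targets)
```

Preliminary training simply keeps the run (here, iteration 4) whose weights give the smallest such value, and hands those weights to final training as starting values.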
MLP with 3 Hidden Units
Iteration Pseudo-random Objective
number number seed function
0 789 0.35192
1 761237432 0.20067
2 1092694980 0.18602
3 577625332 0.29195
4 261548896 0.15312
Optimization Start
Parameter Estimates
------------------------------------------------------------------------------
Parameter Estimate Gradient Label
------------------------------------------------------------------------------
1 X_H1 -0.399940 -0.00596 X -> H1
2 Y_H1 -2.215319 0.01670 Y -> H1
3 X_H2 2.570511 -0.03857 X -> H2
4 Y_H2 0.672317 -0.02087 Y -> H2
5 X_H3 2.589547 0.01907 X -> H3
6 Y_H3 -1.945493 0.00149 Y -> H3
7 BIAS_H1 2.153111 0.01586 BIAS -> H1
8 BIAS_H2 2.276595 0.04635 BIAS -> H2
9 BIAS_H3 -2.243021 0.00979 BIAS -> H3
10 H1_C1 5.688000 0.00208 H1 -> C1
11 H2_C1 5.828867 0.0008060 H2 -> C1