PROC DMNEURL: Approximation to PROC NEURAL
DATA=SASdataset :
specifies an input data set generated by PROC DMDB that is associated with
a valid catalog specified by the DMDBCAT= option. This option must be
specified; no default is permitted. The DATA= data set must contain
interval-scaled variables and CLASS variables in the specific form written by
PROC DMDB.
DMDBCAT=SAScatalog :
specifies an input catalog of meta information generated by PROC DMDB
that is associated with a valid data set specified by the DATA= option. The
catalog contains important information (e.g., range of variables, number of
missing values of each variable, moments of variables) that is used by many
other procedures that require a DMDB data set. This means that the
DMDBCAT= catalog and the DATA= data set must be in sync to obtain proper
results. This option must be specified; no default is permitted.
TESTDATA=SASdataset :
specifies a second input data set that by default is NOT generated by PROC
DMDB but must contain all variables of the DATA= input data set that are
used in the model. The variables not used in the model may be different.
The order of variables is not relevant. If TESTDATA= is specified, you can
specify a TESTOUT= output data set (containing predicted values and
residuals), which relates to the TESTDATA= input data set in the same way
as the OUT= data set relates to the DATA= input training data set. If you
specify the TESTDMDB option, you may use a data set generated by PROC
DMDB as the TESTDATA= input data set.
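
The rule for TESTDATA= (every variable used in the model must be present,
order and extra variables do not matter) can be checked before running the
procedure. The following Python sketch is only an illustration outside SAS;
the function name and the example variable names are hypothetical.

```python
def missing_model_variables(model_vars, test_columns):
    """Return the model variables that are absent from the test data set.

    Hypothetical helper, not part of PROC DMNEURL. Column order and extra
    columns in the test data set are ignored, as the TESTDATA= rule allows.
    """
    # SAS variable names are case-insensitive, so compare in upper case.
    available = {name.upper() for name in test_columns}
    return [name for name in model_vars if name.upper() not in available]


# Example with invented variable names:
print(missing_model_variables(["age", "income"], ["INCOME", "extra", "Age"]))
print(missing_model_variables(["age", "income"], ["age"]))
```

An empty list means the test data set satisfies the requirement.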
OUTCLASS=SASdataset :
specifies an output data set generated by PROC DMNEURL that contains
the mapping between compound variable names and the names of variables
and categories of CLASS variables used in the model. The compound variable
names denote the dummy variables that are created for each category
of a CLASS variable with more than two categories. Since the compound
names of dummy variables are used as variable names in other data sets, the
user must know to which category each compound name corresponds. The
OUTCLASS= data set has only three character variables:
– _NAME_ contains the compound name used as a variable name in other
output data sets
– _VAR_ contains the variable name used in the DATA= input data set
– _LEVEL_ contains the level name of the variable as used in the DATA=
input data set.
Note that if the DATA= input data set does not contain any CLASS variables,
the OUTCLASS= data set is not written.
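
To show how the three character variables (written here as _NAME_, _VAR_,
and _LEVEL_) can be used, the following Python sketch builds a lookup from
compound name to original variable and level. It assumes the OUTCLASS= data
set has been exported to a list of records; the rows and the compound names
shown are invented for illustration and are not output of PROC DMNEURL.

```python
# Hypothetical OUTCLASS= rows; names and levels are invented for illustration.
outclass_rows = [
    {"_NAME_": "COLOR1", "_VAR_": "COLOR", "_LEVEL_": "BLUE"},
    {"_NAME_": "COLOR2", "_VAR_": "COLOR", "_LEVEL_": "GREEN"},
    {"_NAME_": "COLOR3", "_VAR_": "COLOR", "_LEVEL_": "RED"},
]


def build_lookup(rows):
    """Map each compound name to its (variable, level) pair."""
    # Strip padding, since character values are often blank-padded on export.
    return {
        row["_NAME_"].strip(): (row["_VAR_"].strip(), row["_LEVEL_"].strip())
        for row in rows
    }


def describe(compound_name, lookup):
    """Render a compound dummy-variable name as 'VARIABLE = LEVEL'."""
    var, level = lookup[compound_name]
    return f"{var} = {level}"


lookup = build_lookup(outclass_rows)
print(describe("COLOR2", lookup))
```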
OUTEST=SASdataset :
specifies an output data set generated by PROC DMNEURL which contains all
the model information necessary for scoring additional cases or data sets.
Variables of the output data set:
– _TARGET_ (character) name of the target
– _TYPE_ (character) type of observation
– _NAME_ (character) name of observation
– _STAGE_ number of stage
– _MEAN_ contains different numeric information
– _STDEV_ contains different numeric information
– varname variables in the model; the first variables correspond to
CLASS (categorical) variables, the remaining variables are continuously
(interval or ratio) scaled. Note that for nonbinary CLASS (nominal or
ordinal categorical) variables a set of binary dummy variables is created.
In those cases the prefix of the variable names varname used for a group of
variables in the data set may be the same for a successive group of
variables, which differ only by a numeric suffix.
This data set contains all the model information necessary to compute the
predicted model values (scores).
1. The _TYPE_=_V_MAP_ and _TYPE_=_C_MAP_ observations contain the
mapping indices between the variables used in the model and the
number of the variable in the data set.
• The _MEAN_ variable contains the number of index mappings.
• The _STDEV_ variable contains the index of the target (response)
variable in the data set for the _TYPE_=_V_MAP_ observation.
For _TYPE_=_C_MAP_ it contains the level (category) number of
a categorical target variable that corresponds to missing values.
2. The _TYPE_=_EIGVAL_ observation contains the sorted eigenvalues
of the X'X matrix. Here, the _MEAN_ variable contains the number
of model variables (rows/columns of the model X'X matrix) and the
_STDEV_ variable contains the number of model components.
3. For each stage of the estimation process two groups of observations are
written to the OUTEST= data set:
(a) The _TYPE_=_EIGVEC_ observations contain a set of principal
components which are used as predictor variables for the estimation
of the original target value y (in stage 0) or for the prediction of the
stage residual. Here, the _MEAN_ variable contains the value of
the criterion used to include the component in the model, which is
normally the R² value. The _STDEV_ variable contains the
eigenvalue number to which the eigenvector corresponds.
1  SQUARE  (a + b * x) * x
2  TANH    a * tanh(b * x)
3  ARCTAN  a * atan(b * x)
4  LOGIST  exp(a * x) / (1 + exp(a * x))
5  GAUSS   a * exp(-(b * x)^2)
6  SIN     a * sin(b * x)
7  COS     a * cos(b * x)
8  EXP     a * exp(b * x)
The _NAME_ variable reports the corresponding name of the best
activation function found.
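
The eight activation functions listed above (SQUARE through EXP) can be
sketched in Python as follows. This is only an illustration, not the
procedure's implementation: it assumes a reading of the table in which a and
b are coefficients and x is the input value, and it takes a and b as given
rather than estimating them. The names ACTIVATIONS and activate are
hypothetical.

```python
import math

# One entry per activation function name; each takes the input x and the
# coefficients a and b (assumed reading of the table, see the text above).
ACTIVATIONS = {
    "SQUARE": lambda x, a, b: (a + b * x) * x,
    "TANH":   lambda x, a, b: a * math.tanh(b * x),
    "ARCTAN": lambda x, a, b: a * math.atan(b * x),
    # LOGIST uses only the coefficient a; b is accepted but ignored.
    "LOGIST": lambda x, a, b: math.exp(a * x) / (1.0 + math.exp(a * x)),
    "GAUSS":  lambda x, a, b: a * math.exp(-((b * x) ** 2)),
    "SIN":    lambda x, a, b: a * math.sin(b * x),
    "COS":    lambda x, a, b: a * math.cos(b * x),
    "EXP":    lambda x, a, b: a * math.exp(b * x),
}


def activate(name, x, a=1.0, b=1.0):
    """Evaluate the activation function called `name` at x."""
    return ACTIVATIONS[name](x, a, b)


# Evaluate every function at one point with unit coefficients.
for name in ACTIVATIONS:
    print(name, activate(name, 0.5))
```

Looking the function up by name mirrors how the _NAME_ variable identifies
the chosen activation function by its name.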