The arboretum procedure

§ PROC DMNEURL: Approximation to PROC NEURAL

Date: 30.04.2018


set of about 8 activation functions and select the best result. Since the optimization processes for different activation functions do not depend on each other, the computer time could be reduced greatly by parallel processing.
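The independence of the per-activation optimizations can be illustrated with a small sketch. The following Python example is only an illustration of the idea, not how PROC DMNEURL is implemented: the activation names, the one-parameter least-squares fit, and the helpers `fit_sse` and `best_activation` are all assumptions made for this example.

```python
import math
from concurrent.futures import ThreadPoolExecutor

# Hypothetical candidate activation functions (illustrative names only).
ACTIVATIONS = {
    "LOGIST": lambda z: 1.0 / (1.0 + math.exp(-z)),
    "TANH": math.tanh,
    "ARCTAN": math.atan,
    "SQUARE": lambda z: z * z,
}

def fit_sse(name, x, y):
    """Fit y ~ a * f(x) by least squares for one activation f; return (name, SSE)."""
    f = ACTIVATIONS[name]
    fx = [f(v) for v in x]
    denom = sum(u * u for u in fx)
    a = sum(u * t for u, t in zip(fx, y)) / denom if denom else 0.0
    sse = sum((t - a * u) ** 2 for u, t in zip(fx, y))
    return name, sse

def best_activation(x, y):
    """Run every fit independently and keep the one with the smallest SSE."""
    # Threads keep the sketch short; CPU-bound work would normally use processes.
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(lambda n: fit_sse(n, x, y), ACTIVATIONS))
    return min(results, key=lambda r: r[1])

# Toy data generated from tanh, so the TANH fit should win.
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [math.tanh(v) for v in x]
print(best_activation(x, y))
```

Because no fit reads the result of another, the tasks can be dispatched in any order and on any number of workers; only the final selection step needs all results.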

Except in applications where PROC NEURAL would hit a local solution much worse than the global solution, PROC DMNEURL is not expected to beat PROC NEURAL in the precision of the prediction. However, in the applications we have run so far, the results of PROC DMNEURL were very close to those of PROC NEURAL. PROC DMNEURL will be faster than PROC NEURAL only for very large data sets. For small data sets, PROC NEURAL could be much faster than PROC DMNEURL, especially for an interval target. The most efficient application of PROC DMNEURL is the analysis of a binary target variable without FREQ and WEIGHT statements and without COST variables in the input data set.

Application: HMEQ Data Set: Binary Target BAD

To illustrate the use of PROC DMNEURL we choose the HMEQ data set:

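    /* Point SAMPSIO at the sample library (site-specific path) */
    libname sampsio '/sas/a612/dmine/sampsio';

    /* Build the DMDB data set and catalog passed to PROC DMNEURL */
    proc dmdb batch data=sampsio.hmeq out=dmdbout dmdbcat=outcat;
       var LOAN MORTDUE VALUE YOJ DELINQ CLAGE NINQ CLNO DEBTINC;
       class BAD(DESC) REASON(ASC) JOB(ASC) DEROG(ASC);
       target BAD;
    run;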

When selecting the binary target variable BAD, a typical run of PROC DMNEURL would be the following:

    proc dmneurl data=dmdbout dmdbcat=outcat
                 outclass=oclass outest=estout out=dsout outfit=ofit
                 ptable maxcomp=3 maxstage=5;
       var LOAN MORTDUE VALUE REASON JOB YOJ DEROG DELINQ
           CLAGE NINQ CLNO DEBTINC;
       target BAD;
    run;

The number $p$ of parameters estimated in each stage of the optimization is $p = 2c + 1$, where $c$ is the number of components selected at the stage. Since $c = 3$ is specified here with the MAXCOMP= option, each optimization process estimates only $p = 7$ parameters.

First, some general information is printed, followed by the four moments of the numeric data set variables involved in the analysis:

    The DMNEURL Procedure

    Binary Target              BAD
    Number Observations        5960
    NOBS w/o Missing Target    5960


    Link Function              LOGIST
    Selection Criterion        SSE
    Optimization Criterion     SSE
    Estimation Stages          5
    Max. Number Components     3
    Minimum R2 Value           0.000050
    Number Grid Points         17

    Response Profile for Target: BAD

    Level    Nobs    Frequency         Weight
    1        1189         1189    1189.000000
    0        4771         4771    4771.000000

    Variable          Mean     Std Dev   Skewness   Kurtosis
    LOAN             18608       11207    2.02378    6.93259
    MORTDUE          67350       44458    1.81448    6.48187
    VALUE            99863       57386    3.05334   24.36280
    YOJ            8.15130     7.57398    0.98846    0.37207
    DELINQ         0.40570     1.12727    4.02315   23.56545
    CLAGE        170.47634    85.81009    1.34341    7.59955
    NINQ           1.08456     1.72867    2.62198    9.78651
    CLNO          20.50285    10.13893    0.77505    1.15767
    DEBTINC       26.59885     8.60175    2.85235   50.50404
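The four moments in the listing above can be computed for any numeric column with a few lines of code. The following Python sketch assumes the bias-corrected sample formulas (an n - 1 denominator for the standard deviation, adjusted skewness, and excess kurtosis). The source does not state which definitions PROC DMNEURL uses, and `four_moments` is a hypothetical helper name.

```python
import math

def four_moments(values):
    """Return (mean, std dev, skewness, excess kurtosis) of a numeric sample.

    Assumes bias-corrected sample formulas; needs at least 4 observations
    and a nonzero standard deviation.
    """
    n = len(values)
    if n < 4:
        raise ValueError("need at least 4 observations")
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    if std == 0.0:
        raise ValueError("standard deviation is zero")
    z = [(v - mean) / std for v in values]  # standardized values
    skew = n / ((n - 1) * (n - 2)) * sum(t ** 3 for t in z)
    kurt = (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * sum(t ** 4 for t in z)
            - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))
    return mean, std, skew, kurt

# Small symmetric sample: skewness is zero, excess kurtosis is negative.
print(four_moments([1.0, 2.0, 3.0, 4.0, 5.0]))
```

Large skewness and kurtosis values, such as those reported for VALUE, DELINQ, and DEBTINC, indicate heavily right-tailed distributions.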

For the first stage we select three eigenvectors corresponding to the 4th, 11th, and 2nd largest eigenvalues. Obviously, there is no relationship between the $R^2$ value, which measures the prediction of the response (target) variable by each eigenvector, and the eigenvalue corresponding to each eigenvector, which measures the variance explained in the data matrix. Therefore, the eigenvalues are not used in the analysis of PROC DMNEURL and are printed only for curiosity.
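To see why a large eigenvalue says nothing about predictive value, consider a toy case. The Python sketch below uses two made-up, mutually orthogonal component score vectors and a made-up binary target. The data and the helper `r_square` are assumptions for illustration only, not values or code from PROC DMNEURL.

```python
def r_square(scores, y):
    """Squared correlation between one component's scores and the target."""
    n = len(y)
    s_mean = sum(scores) / n
    y_mean = sum(y) / n
    sxy = sum((s - s_mean) * (t - y_mean) for s, t in zip(scores, y))
    sxx = sum((s - s_mean) ** 2 for s in scores)
    syy = sum((t - y_mean) ** 2 for t in y)
    return sxy * sxy / (sxx * syy)

# Hypothetical component scores (orthogonal to each other, mean zero).
components = {
    1: [-10.0, 10.0, -10.0, 10.0],  # large spread, unrelated to the target
    2: [-1.0, -1.0, 1.0, 1.0],      # small spread, tracks the target
}
y = [0.0, 0.0, 1.0, 1.0]

# For unit-length eigenvectors, the eigenvalue equals the sum of squared scores.
eigenvalues = {k: sum(v * v for v in s) for k, s in components.items()}

# Rank components by how well they predict y, ignoring the eigenvalues.
ranked = sorted(components, key=lambda k: r_square(components[k], y), reverse=True)
print(eigenvalues, ranked)
```

Here the component with the smaller eigenvalue is ranked first, which mirrors the listing in which the 4th and 11th components are chosen ahead of the 2nd.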

    Component Selection: SS(y) and R2 (SS_total=4771)

    Comp         Eigval   R-Square      F Value   p-Value           SSE
    4       9397.769045   0.017419   105.640645    <.0001   4687.893424
    11      6327.041282   0.006317    38.550835    <.0001   4657.755732
    2             13164   0.005931    36.408247    <.0001   4629.461194
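The SSE column of the component selection listing appears to be consistent with subtracting each component's R-Square share of SS_total from the running SSE. This is an observation about the printed numbers, not a formula stated in the source. The Python sketch below checks it using the values from the listing; `cumulative_sse` is a hypothetical helper name.

```python
SS_TOTAL = 4771.0

# (component, R-Square, reported SSE) copied from the listing.
ROWS = [
    (4, 0.017419, 4687.893424),
    (11, 0.006317, 4657.755732),
    (2, 0.005931, 4629.461194),
]

def cumulative_sse(ss_total, r_squares):
    """Running SSE after removing each component's share R2 * SS_total."""
    sse = ss_total
    out = []
    for r2 in r_squares:
        sse -= r2 * ss_total
        out.append(sse)
    return out

computed = cumulative_sse(SS_TOTAL, [r2 for _, r2, _ in ROWS])
for (comp, _, reported), value in zip(ROWS, computed):
    # Differences are small and attributable to rounding of the printed R-Square.
    print(comp, round(value, 3), reported, round(abs(value - reported), 4))
```

Under this reading, each R-Square value is the incremental contribution of one component, and the three selected components together explain only about 3% of SS_total before the nonlinear optimization starts.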

The optimization history indicates a maximum of 11 iterations for the activation function LOGIST:
