       Absolute    of Squared     Squared      Squared    Divisor
OBS       Error        Errors       Error        Error    for VASE
  1     0.72727       1.52883    .0050961     0.071387         300
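The last three statistics in this row are tied together by simple arithmetic, which can be checked by hand. The following is a minimal sketch in Python (not SAS), assuming the columns hold the sum of squared errors, the average squared error, the root average squared error, and the divisor; the variable names are illustrative only.

```python
import math

# Values copied from the row above; the column meanings are an assumption.
sum_squared_errors = 1.52883
divisor = 300

# Average squared error: the sum of squared errors spread over the divisor.
average_squared_error = sum_squared_errors / divisor

# Root average squared error: the square root of that average.
root_average_squared_error = math.sqrt(average_squared_error)

print(f"{average_squared_error:.7f}")       # 0.0050961
print(f"{root_average_squared_error:.6f}")  # 0.071387
```

Both results agree with the values printed in the report.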
Misclassification Table for the Scored Data Set (OUT=)
Only one customer in the test data set was incorrectly classified. Ideally, you should validate the tree with a test data
set that does not overlap the training data.
Input Tree and Score Test Data
Misclassification Table for the Test Data
TABLE OF F_PURCHA BY I_PURCHA

F_PURCHA(Formatted Target Value)
          I_PURCHA(Predicted Category)

Frequency|
Percent  |
Row Pct  |
Col Pct  |No      |Yes     |  Total
---------+--------+--------+
No       |     64 |      1 |     65
         |  42.67 |   0.67 |  43.33
         |  98.46 |   1.54 |
         | 100.00 |   1.16 |
---------+--------+--------+
Yes      |      0 |     85 |     85
         |   0.00 |  56.67 |  56.67
         |   0.00 | 100.00 |
         |   0.00 |  98.84 |
---------+--------+--------+
Total          64       86      150
            42.67    57.33   100.00
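The percentages in the table follow directly from the four cell counts. As an illustration, this short Python sketch (not SAS) recomputes the misclassification rate and the row and column percentages for the single misclassified customer; the dictionary layout and function names are illustrative choices.

```python
# Cell counts from the table: keys are (actual value, predicted category).
counts = {
    ("No", "No"): 64, ("No", "Yes"): 1,
    ("Yes", "No"): 0, ("Yes", "Yes"): 85,
}

total = sum(counts.values())
misclassified = sum(n for (actual, predicted), n in counts.items()
                    if actual != predicted)
misclassification_rate = misclassified / total

def row_pct(actual, predicted):
    """Cell count as a percentage of its row (actual value) total."""
    row_total = sum(n for (a, _), n in counts.items() if a == actual)
    return 100 * counts[(actual, predicted)] / row_total

def col_pct(actual, predicted):
    """Cell count as a percentage of its column (predicted category) total."""
    col_total = sum(n for (_, p), n in counts.items() if p == predicted)
    return 100 * counts[(actual, predicted)] / col_total

print(total, misclassified)                   # 150 1
print(f"{100 * misclassification_rate:.2f}")  # 0.67
print(f"{row_pct('No', 'Yes'):.2f}")          # 1.54
print(f"{col_pct('No', 'Yes'):.2f}")          # 1.16
```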
Partial PROC PRINT Report of the Score Summary Data Set
Input Tree and Score Test Data
Score Summary Data
            Node       Assessment   Assessment:   Assessment:   Decision   Formatted
       Identification      of        PURCHASE      PURCHASE     Assigned    Target
OBS        Number      Prediction     = Yes         = No        to Case     Value
  1          21          0.58527      0.41473       0.58527       No         No
  2          21          0.58527      0.41473       0.58527       No         Yes
  3         128          0.89474      0.89474       0.10526       Yes        Yes
  4          21          0.58527      0.41473       0.58527       No         No
  5         128          0.89474      0.89474       0.10526       Yes        Yes
  6          21          0.58527      0.41473       0.58527       No         No
  7          33          0.53237      0.53237       0.46763       Yes        Yes
  8         120          1.00000      1.00000       0.00000       Yes        Yes
  9          33          0.53237      0.53237       0.46763       Yes        Yes
 10          49          1.00000      0.00000       1.00000       No         No

                   Predicted:   Predicted:                 Residual:   Residual:
      Predicted     PURCHASE     PURCHASE    Predicted:     PURCHASE    PURCHASE   Residual:
OBS   Category       = Yes        = No        PURCHASE       = Yes       = No      PURCHASE
  1     No          0.41473      0.58527      0.58527      -0.41473     0.41473     0.41473
  2     No          0.41473      0.58527      0.58527       0.58527    -0.58527    -0.58527
  3     Yes         0.89474      0.10526      0.89474       0.10526    -0.10526     0.10526
  4     No          0.41473      0.58527      0.58527      -0.41473     0.41473     0.41473
  5     Yes         0.89474      0.10526      0.89474       0.10526    -0.10526     0.10526
  6     No          0.41473      0.58527      0.58527      -0.41473     0.41473     0.41473
  7     Yes         0.53237      0.46763      0.53237       0.46763    -0.46763     0.46763
  8     Yes         1.00000      0.00000      1.00000       0.00000     0.00000     0.00000
  9     Yes         0.53237      0.46763      0.53237       0.46763    -0.46763     0.46763
 10     No          0.00000      1.00000      1.00000       0.00000     0.00000     0.00000
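The two per-level residual columns in this report are consistent with a residual defined as a 0/1 target indicator minus the predicted probability for that level. That definition is inferred from the printed values, not stated by the report. The Python sketch below (not SAS) reproduces the first three observations under that assumption; the function name is hypothetical.

```python
def residuals(target, p_yes):
    """Return (residual for Yes, residual for No), assuming each residual is
    the 0/1 target indicator minus the predicted probability for that level."""
    p_no = 1.0 - p_yes
    y_yes = 1.0 if target == "Yes" else 0.0
    y_no = 1.0 - y_yes
    return round(y_yes - p_yes, 5), round(y_no - p_no, 5)

print(residuals("No", 0.41473))   # observation 1: (-0.41473, 0.41473)
print(residuals("Yes", 0.41473))  # observation 2: (0.58527, -0.58527)
print(residuals("Yes", 0.89474))  # observation 3: (0.10526, -0.10526)
```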
Copyright 2000 by SAS Institute Inc., Cary, NC, USA. All rights reserved.
Before you analyze the data using the DMSPLIT procedure, you must create
the DMDB encoded data set and catalog. For more information about how to do
this, see "Example 1: Getting Started with the DMDB Procedure"
in the DMDB procedure documentation.
proc dmdb batch data=sampsio.dmexa1 out=dmbexa1 dmdbcat=catexa1;
   id acctnum;
   var amount income homeval frequent recency age
       domestic apparel;
   class purchase(desc) marital ntitle gender telind
         origin job statecod numcars edlevel;
run;
The PROC DMSPLIT statement invokes the procedure. The DATA= option identifies
the DMDB encoded training data set that is used to fit the model. The DMDBCAT=
option identifies the DMDB training data catalog.
proc dmsplit data=dmbexa1 dmdbcat=catexa1
The BINS= option specifies the number of categories into which the range
of each interval variable is divided for splits.
bins=30
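To make the idea of dividing a range into categories concrete, here is a minimal Python sketch (not SAS). It assumes equal-width bins, which this documentation does not confirm; the function name, the value range, and the sample ages are hypothetical.

```python
def bin_index(value, low, high, bins=30):
    """Map a value in [low, high] to an equal-width bin number, 0 to bins - 1.
    Equal-width binning is an assumption made for illustration."""
    if high <= low:
        raise ValueError("high must be greater than low")
    width = (high - low) / bins
    index = int((value - low) / width)
    # The maximum value would land just past the last bin, so clamp it.
    return min(max(index, 0), bins - 1)

# Hypothetical ages between 18 and 78, so each of the 30 bins spans 2 years.
print(bin_index(18, 18, 78))  # 0
print(bin_index(41, 18, 78))  # 11
print(bin_index(78, 18, 78))  # 29
```

Under this assumption, a larger BINS= value gives narrower categories and therefore more candidate split points per interval variable.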
The CHISQ= option specifies the minimum Chi-Square value that a candidate
variable split must reach to remain eligible. The CHISQ= value governs
the number of splits that are performed. As you increase the CHISQ= value,
the procedure performs fewer splits and makes fewer passes through the input data.
chisq=2.00
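The effect of a chi-square threshold can be illustrated with a small Python sketch (not SAS). It assumes a Pearson chi-square statistic on a 2x2 table of branch by target counts and a greater-than-or-equal comparison; the exact statistic and comparison that the procedure uses are not given here, and the counts below are hypothetical.

```python
def chi_square_2x2(table):
    """Pearson chi-square statistic for a 2x2 table of counts
    (no continuity correction)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    return statistic

CHISQ_THRESHOLD = 2.00  # the CHISQ= value used in this example

# Hypothetical candidate splits: rows are the two branches,
# columns are the No and Yes counts in each branch.
strong_split = [[40, 10], [10, 40]]
weak_split = [[26, 24], [24, 26]]

for name, table in [("strong", strong_split), ("weak", weak_split)]:
    stat = chi_square_2x2(table)
    print(name, round(stat, 2), stat >= CHISQ_THRESHOLD)
# strong 36.0 True
# weak 0.16 False
```

With a threshold of 2.00, only the first candidate would remain eligible; raising the threshold removes more candidates, which matches the description that higher values lead to fewer splits.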