OBS  _SPLIT_   _PASS_  _NODE_  _PARENT_  _CHISQU_    _VALUE_  _LEVEL1_  _LEVEL2_
  1  FREQUENT       1       1         0   92.3380       2.36   92.3380         .
  2  STATECOD       2       2         1   29.4821          .   23.0000        31
  3  DOMESTIC       2       3         1   30.8040       3.20   30.8040         .
  4  JOB            3       4         2   11.5823          .   10.0000         4
  5  STATECOD       3       5         2   13.4582          .   20.0000        11
  6  STATECOD       3       6         3   30.8833          .   35.0000        18
  7  STATECOD       3       7         3   29.7871          .   21.0000        24
  8  JOB            4       8         4    7.2797          .    4.0000         6
  9  STATECOD       4       9         4    9.9048          .    2.0000         9
 10  HOMEVAL        4      10         5    7.0284   40000.00    7.0284         .
 11  HOMEVAL        4      11         5    6.5551  240000.00    6.5551         .
 12  STATECOD       4      12         6    8.2859          .   13.0000        22
 13  JOB            4      13         6    8.8271          .    5.0000         9
 14  STATECOD       4      14         7    7.1586          .   17.0000         4
 15  APPAREL        4      15         7   11.8484       1.67   11.8484         .
 16  HOMEVAL        5      17         8    6.0595   20000.00    6.0595         .
 17  AGE            5      18         9    3.0000      34.60    3.0000         .
 18  STATECOD       5      20        10   11.1158          .    5.0000        11
 19  JOB            5      21        10    5.0285          .   10.0000         3
 20  FREQUENT       5      22        11    6.4222       2.29    6.4222         .
PROC SPLIT Output
PROC PRINT Report of the Training Data Fit Statistics (OUTMATRIX=)
The report consists of the classification counts and proportions for the buyers ("Yes") and the non-buyers ("No"). You
can interpret the first two rows of the table as follows:
- 994 of the 999 actual buyers were correctly classified; only 5 buyers were incorrectly classified as non-buyers.
- 959 of the 967 non-buyers were correctly classified; only 8 non-buyers were incorrectly classified as buyers.
The values in the STAT column enable you to identify the rows that pertain to counts (N), row and column percentages
(Row% and Col%), and overall percentages (%).
Import and Save Tree from DMSPLT
Training Statistics
OBS   STAT   PURCHASE   ==> Yes   ==> No   TOTAL
  1   N      Yes            994        5     999
  2   N      No               8      959     967
  3   N      SUM           1002      964    1966
  4   Row%   Yes             99        1     100
  5   Row%   No               1       99     100
  6   Row%   SUM             51       49     100
  7   Col%   Yes             99        1      51
  8   Col%   No               1       99      49
  9   Col%   SUM            100      100     100
 10   %      Yes             51        0      51
 11   %      No               0       49      49
 12   %      SUM             51       49     100
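The percentages in the report follow directly from the four counts in rows 1 and 2. As a quick cross-check outside SAS, the following Python sketch recomputes the row, column, and overall percentages and the training misclassification rate. The variable names are illustrative and not part of PROC SPLIT, and rounding to whole percents is an assumption chosen to match the printed report.

```python
# Classification counts copied from rows 1 and 2 of the report,
# indexed as counts[actual][predicted].
counts = {
    "Yes": {"Yes": 994, "No": 5},
    "No": {"Yes": 8, "No": 959},
}
labels = ["Yes", "No"]

row_totals = {a: sum(counts[a].values()) for a in labels}
col_totals = {p: sum(counts[a][p] for a in labels) for p in labels}
grand_total = sum(row_totals.values())


def pct(part, whole):
    """Percentage rounded to a whole number (assumed to match the report)."""
    return round(100 * part / whole)


# Row%: share of each actual class; Col%: share of each predicted class;
# %: share of all cases.
row_pct = {a: {p: pct(counts[a][p], row_totals[a]) for p in labels} for a in labels}
col_pct = {a: {p: pct(counts[a][p], col_totals[p]) for p in labels} for a in labels}
overall_pct = {a: {p: pct(counts[a][p], grand_total) for p in labels} for a in labels}

# Training misclassification rate: off-diagonal counts over all cases.
misclassified = counts["Yes"]["No"] + counts["No"]["Yes"]
train_misc_rate = misclassified / grand_total

print("Row%:", row_pct)
print("Col%:", col_pct)
print("%:   ", overall_pct)
print(f"{misclassified} of {grand_total} training cases misclassified "
      f"(rate {train_misc_rate:.4f})")
```

The off-diagonal counts (5 and 8) are the only misclassified training cases, which is why every Row% and Col% value on the diagonal rounds to 99.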
Partial PROC PRINT Report of the Leaf Statistics (OUTLEAF=)
The leaf report contains the following information:
- Leaf identification number
- Number of customers in each leaf
- Percentages of the binary target values in each leaf.
Notice the purity of the leaf nodes.
Import and Save Tree from DMSPLT
Leaf Statistics
OBS   LEAF ID     N    % Yes     % No
  1        16     9   100.00     0.00
  2        54    10   100.00     0.00
  3        55     1     0.00   100.00
  4       142     2     0.00   100.00
  5       143     2   100.00     0.00
  6        89     8   100.00     0.00
  7       332     2   100.00     0.00
  8       333     8     0.00   100.00
  9       416    12   100.00     0.00
 10       417     1     0.00   100.00
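One way to quantify the purity of these leaves is to take, for each leaf, the larger of the two target percentages. The Python sketch below does this for the ten leaves listed in the partial report. `purity` is a hypothetical helper name, and the majority-class share is only one common purity measure, not necessarily the criterion PROC SPLIT uses internally.

```python
# (leaf_id, n, pct_yes, pct_no) copied from the partial leaf report.
leaves = [
    (16, 9, 100.0, 0.0),
    (54, 10, 100.0, 0.0),
    (55, 1, 0.0, 100.0),
    (142, 2, 0.0, 100.0),
    (143, 2, 100.0, 0.0),
    (89, 8, 100.0, 0.0),
    (332, 2, 100.0, 0.0),
    (333, 8, 0.0, 100.0),
    (416, 12, 100.0, 0.0),
    (417, 1, 0.0, 100.0),
]


def purity(pct_yes, pct_no):
    """Majority-class share of a leaf, in percent (hypothetical helper)."""
    return max(pct_yes, pct_no)


# Leaves in which every customer has the same target value.
pure_leaves = [leaf_id for leaf_id, _, y, n_pct in leaves if purity(y, n_pct) == 100.0]

# Customers covered by the listed leaves, split by target value.
customers = sum(n for _, n, _, _ in leaves)
yes_customers = sum(round(n * y / 100) for _, n, y, _ in leaves)
no_customers = customers - yes_customers

print(f"{len(pure_leaves)} of {len(leaves)} listed leaves are pure")
print(f"{customers} customers: {yes_customers} Yes, {no_customers} No")
```

Because the report is partial, these totals describe only the ten leaves shown, not the whole tree.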
PROC PRINT Report of the Score Fit Statistics (OUTFIT=)
The misclassification rate for the scored data set is almost zero (0.0066667, or about 0.67%). You can compare the
maximum absolute error, sum of squared errors, average squared error, and root average squared error from this tree
with those of other candidate trees (models). Smaller values of these fit statistics are preferred.
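To make the statistics named above concrete, here is a minimal Python sketch of how such quantities can be computed from scored cases. The four cases and their posterior probabilities are made up for illustration and are not taken from the report. The sketch also assumes that one error is computed per target level and that the average squared error divides by cases times target levels; treat that divisor as an assumption rather than a statement of what PROC SPLIT does. The last line reads the reported test misclassification rate back as a case count, using the rate and the sum of frequencies shown in the report that follows.

```python
import math

# Hypothetical scored cases: (actual class, posterior probability of "Yes").
# These numbers are invented for illustration only.
cases = [("Yes", 0.95), ("Yes", 0.80), ("No", 0.10), ("No", 0.60)]

errors = []
for actual, p_yes in cases:
    y_yes = 1.0 if actual == "Yes" else 0.0
    # One error per target level: class indicator minus posterior probability.
    errors.append(y_yes - p_yes)                  # level "Yes"
    errors.append((1.0 - y_yes) - (1.0 - p_yes))  # level "No"

max_abs_error = max(abs(e) for e in errors)
sse = sum(e * e for e in errors)   # sum of squared errors
ase = sse / len(errors)            # assumed divisor: cases x target levels
rase = math.sqrt(ase)              # root average squared error

# A case is misclassified when the more probable class differs from the actual one.
misclassified = sum((p >= 0.5) != (actual == "Yes") for actual, p in cases)
misc_rate = misclassified / len(cases)

# Reported test rate times reported sum of frequencies, as a case count.
reported_errors = round(0.0066667 * 150)

print(f"max abs error {max_abs_error:.2f}, SSE {sse:.4f}, "
      f"ASE {ase:.6f}, RASE {rase:.4f}")
print(f"hypothetical misclassification rate {misc_rate:.2f}")
print(f"reported rate corresponds to {reported_errors} of 150 test cases")
```

All of these measures shrink as the posterior probabilities move toward the actual class, which is why smaller values indicate a better-fitting tree.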
Input Tree and Score Test Data
Test Data Fit Statistics
       Test:          Test: Sum      Test: Frequency   Test: Frequency   Test:
       Sum of         of Weights     of Classified     of Unclassified   Misclassification
OBS    Frequencies    Times Freqs    Cases             Cases             Rate
  1    150            300            150               0                 .0066667
Test: Test: Test: Root
Maximum Test: Sum Average Average Test: