Group: KITCHEN*RACE 6 0.0210
AOV16: TOWELS 10 0.0208
AOV16: OUTDOOR 11 0.0201
Class: GENDER*KITCHEN 16 0.0197 R2 < MINR2
Group: GENDER*KITCHEN 5 0.0195 R2 < MINR2
Class: LUXURY*DISHES 15 0.0193 R2 < MINR2
Group: LUXURY*DISHES 5 0.0190 R2 < MINR2
Class: APRTMNT*KITCHEN 15 0.0189 R2 < MINR2
Class: SNGLMOM*KITCHEN 15 0.0187 R2 < MINR2
Group: APRTMNT*KITCHEN 4 0.0186 R2 < MINR2
AOV16: HHAPPAR 13 0.0185 R2 < MINR2
AOV16: JEWELRY 10 0.0185 R2 < MINR2
Group: SNGLMOM*KITCHEN 4 0.0184 R2 < MINR2
Class: MOBILE*KITCHEN 15 0.0184 R2 < MINR2
Group: MOBILE*KITCHEN 5 0.0182 R2 < MINR2
AOV16: LAMPS 12 0.0178 R2 < MINR2
Class: KITCHEN 9 0.0174 R2 < MINR2
AOV16: LINENS 13 0.0173 R2 < MINR2
AOV16: PROMO13 15 0.0172 R2 < MINR2
Group: KITCHEN 3 0.0171 R2 < MINR2
AOV16: BLANKETS 11 0.0169 R2 < MINR2
Class: LUXURY*ORIGIN 11 0.0168 R2 < MINR2
Group: LUXURY*ORIGIN 5 0.0167 R2 < MINR2
Var: OUTDOOR 1 0.0166 R2 < MINR2
Class: DISHES*HEAT 24 0.0165 R2 < MINR2
Additional Effects Are Not Listed
SS and R2 Portion for Effects Chosen for Target
This section lists the input variables chosen by the forward stepwise regression. The table is divided into the following five
columns:
• Effect lists the sequentially selected effects, which are ranked by the R-square statistic.
• DF shows the degrees of freedom associated with each model effect.
• R2 measures the sequential improvement in the model as input variables are selected. Multiply the R2 statistic by 100
  to express it as a percentage. You can interpret the R2 statistic for the KITCHEN*STATECOD interaction as "10.45%
  of the variation in the target PURCHASE is explained by its linear relationship with this effect". The R2 statistic for
  NTITLE*STATECOD indicates that this two-factor interaction accounts for an additional 6.83% of the target
  variation.
• SS lists the sums of squares for each model effect.
• EMS lists the Error Mean Square, which measures variation due either to random error or to other inputs that are not in
  the model. The EMS should get smaller as important inputs are added to the model.
DMINE: Binary Target
Effect                      DF      R2          SS        EMS
-------------------------------------------------------------
Class: KITCHEN*STATECOD    197  0.1045    51.35546  0.2488769
Class: NTITLE*STATECOD     132  0.0683    33.55930  0.2484444
Class: STATECOD*ORIGIN     106  0.0660    32.43976  0.2444544
Class: DISHES*STATECOD     100  0.0473    23.21811  0.2453127
Class: STATECOD*EDLEVEL     83  0.0437    21.49171  0.2444732
AOV16: FREQUENT              9  0.0332    16.30766  0.2339296
Class: STATECOD*HEAT        73  0.0369    18.15072  0.2330807
Class: STATECOD*NUMCARS     55  0.0237    11.66560  0.2340343
Class: STATECOD*RACE        40  0.0228    11.18392  0.2324765
Class: LUXURY*STATECOD      37  0.0188     9.22242  0.2319286
Class: MARITAL*STATECOD     39  0.0172     8.45564  0.2324675
Group: TMKTORD*STATECOD      8  0.0093     4.55481  0.2299859
Var: RECENCY                 1  0.0066     3.25424  0.2271985
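The relationship between the R2 and SS columns can be checked by hand. The following is a minimal sketch in Python (not part of the DMINE output) that copies the values from the table above and assumes the total sum of squares of 491.36979 reported in the final ANOVA table for PURCHASE. It shows that each R2 portion is the effect's SS divided by the total SS, and that the portions accumulate to the model totals.

```python
# Values copied from the chosen-effects table: (effect, DF, R2, SS).
chosen = [
    ("Class: KITCHEN*STATECOD", 197, 0.1045, 51.35546),
    ("Class: NTITLE*STATECOD", 132, 0.0683, 33.55930),
    ("Class: STATECOD*ORIGIN", 106, 0.0660, 32.43976),
    ("Class: DISHES*STATECOD", 100, 0.0473, 23.21811),
    ("Class: STATECOD*EDLEVEL", 83, 0.0437, 21.49171),
    ("AOV16: FREQUENT", 9, 0.0332, 16.30766),
    ("Class: STATECOD*HEAT", 73, 0.0369, 18.15072),
    ("Class: STATECOD*NUMCARS", 55, 0.0237, 11.66560),
    ("Class: STATECOD*RACE", 40, 0.0228, 11.18392),
    ("Class: LUXURY*STATECOD", 37, 0.0188, 9.22242),
    ("Class: MARITAL*STATECOD", 39, 0.0172, 8.45564),
    ("Group: TMKTORD*STATECOD", 8, 0.0093, 4.55481),
    ("Var: RECENCY", 1, 0.0066, 3.25424),
]

# Assumption: total SS taken from the final ANOVA table for PURCHASE.
TOTAL_SS = 491.36979

# Each R2 portion should equal SS / total SS, up to rounding in the listing.
shares = [(name, r2, ss / TOTAL_SS) for name, _, r2, ss in chosen]
for name, r2, share in shares:
    print(f"{name:<26}{r2:.4f}  {share:.4f}")

# The sequential portions accumulate to the model DF, R2, and SS.
cum_df = sum(df for _, df, _, _ in chosen)
cum_r2 = sum(r2 for _, _, r2, _ in chosen)
cum_ss = sum(ss for _, _, _, ss in chosen)
print(cum_df, round(cum_r2, 4), round(cum_ss, 5))
```

The accumulated DF, R2, and SS can be compared against the Model row of the final ANOVA table.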
The Final Anova Table for the Target
The ANOVA table is divided into the following four columns:
• Effect labels the source of variation as Model, Error, or Total.
• DF lists the degrees of freedom for each source of variation.
• R2 is the model R2, which is the ratio of the model sums of squares (SS) to the total sums of squares. In this example,
  the selected inputs collectively explain 49.83% of the total variability in the target PURCHASE.
• SS partitions the total target variation into portions that can be attributed to the model inputs and to error.
The final ANOVA table for target: PURCHASE
Effect    DF      R2         SS
-------------------------------------------
Model    880  0.4983  244.85936
Error   1085          246.51043
Total   1965          491.36979
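As a quick arithmetic check of the ANOVA table, the sketch below (Python, using only the numbers printed above) recomputes the model R2 as the ratio of model SS to total SS and confirms that the model and error rows add up to the total. The last line assumes the usual definition of the Error Mean Square as error SS divided by error DF. The document does not state the formula, but the result agrees with the final EMS value in the chosen-effects table (0.2271985) to about six decimal places.

```python
# Values copied from the final ANOVA table for PURCHASE.
model_df, model_ss = 880, 244.85936
error_df, error_ss = 1085, 246.51043
total_df, total_ss = 1965, 491.36979

# Model R2 is the ratio of model SS to total SS.
model_r2 = model_ss / total_ss
print(f"model R2 = {model_r2:.4f} ({100 * model_r2:.2f}% of target variation)")

# SS and DF partition into model and error components.
print(f"model SS + error SS = {model_ss + error_ss:.5f}")
print(f"model DF + error DF = {model_df + error_df}")

# Assumption: EMS = error SS / error DF (the standard definition).
error_mean_square = error_ss / error_df
print(f"EMS = {error_mean_square:.7f}")
```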
SS and R2 portion for Effects Not Chosen for the Target: PURCHASE
SS and R2 portion for Effects not chosen for target: PURCHASE
Effect                    DF      R2       SS
---------------------------------------------------------------
Class: GENDER*STATECOD     0  0.0000  0.00000
Var: FREQUENT              1  0.0010  0.50991
Var: DOMESTIC              1  0.0012  0.57574
Estimating Logistic
When the target is binary, predicted values (SUPERX values) are computed from the forward stepwise regression. The SUPERX
values are then grouped into 256 equally spaced intervals, which are used as the independent variable in a final logistic
regression analysis. The logistic regression helps you decide the cutoff for the binary response. Because there is only one
input, only two parameters are estimated (the intercept and the slope).
The first table shows the iteration history for estimating the intercept (alpha) and the slope (beta) for the approximate logistic
regression.
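To make the idea concrete, here is a minimal Python sketch of a two-parameter logistic fit on binned predictions. It is an illustration under stated assumptions, not the DMINE implementation: the data are synthetic, the helper names `bin_midpoints` and `fit_logistic` are hypothetical, each value is represented by the midpoint of its interval, and the fit uses plain Newton-Raphson, whereas the procedure's own "approximate" algorithm is not described here.

```python
import math
import random


def bin_midpoints(values, n_bins=256):
    """Replace each value by the midpoint of its equally spaced interval."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    binned = []
    for v in values:
        k = min(int((v - lo) / width), n_bins - 1)  # top edge goes in last bin
        binned.append(lo + (k + 0.5) * width)
    return binned


def fit_logistic(x, y, max_iter=25, tol=1e-8):
    """Newton-Raphson fit of P(y=1) = 1 / (1 + exp(-(alpha + beta * x)))."""
    alpha, beta = 0.0, 0.0
    history = []  # (iteration, alpha, beta), analogous to an iteration history
    for it in range(1, max_iter + 1):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(alpha + beta * xi)))
            w = p * (1.0 - p)
            g0 += yi - p           # gradient with respect to alpha
            g1 += (yi - p) * xi    # gradient with respect to beta
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        d_alpha = (h11 * g0 - h01 * g1) / det
        d_beta = (h00 * g1 - h01 * g0) / det
        alpha += d_alpha
        beta += d_beta
        history.append((it, alpha, beta))
        if max(abs(d_alpha), abs(d_beta)) < tol:
            break
    return alpha, beta, history


# Synthetic stand-in for SUPERX values and a binary target (assumed data).
random.seed(0)
superx = [random.random() for _ in range(2000)]
target = [1 if random.random() < 1.0 / (1.0 + math.exp(-(-3.0 + 6.0 * s))) else 0
          for s in superx]

binned = bin_midpoints(superx)
alpha, beta, history = fit_logistic(binned, target)
cutoff = -alpha / beta  # binned value at which the fitted probability is 0.5
for step in history:
    print(step)
print(f"alpha={alpha:.3f} beta={beta:.3f} cutoff={cutoff:.3f}")
```

The point where the fitted probability crosses 0.5 is only one possible cutoff; other thresholds may be more appropriate depending on the costs of misclassification.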
The second table contains the predicted values, which are bucketed into the 256 equally sized sub-intervals. The table