The DMINE Procedure
WEIGHT Statement
Alias: WEIGHTS
Tip: Specify the WEIGHT variable in PROC DMDB so that the information is saved in the catalog
and so that the variable is used automatically as a WEIGHT variable in PROC DMINE.
WEIGHT variable;
Required Argument
variable
Specifies one numeric (interval-scaled) variable that is used to weight the input variables.
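The statement might appear in context as follows. This is a minimal sketch: every name in it (DMDBDATA, DMDBCAT, Y, X1, X2, C1, WGT) is hypothetical, and the PROC DMDB step that creates the data set and catalog is omitted.

```sas
/* Sketch only: DMDBDATA and DMDBCAT are assumed to come from an  */
/* earlier PROC DMDB step; WGT is a hypothetical numeric weight.  */
proc dmine data=dmdbdata dmdbcat=dmdbcat;
   var x1 x2 c1;      /* hypothetical input variables             */
   target y;          /* hypothetical target variable             */
   weight wgt;        /* one numeric variable, per the syntax above */
run;
```

If the weight variable was already declared in PROC DMDB, the tip above indicates that the WEIGHT statement can be left out of PROC DMINE.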
Copyright 2000 by SAS Institute Inc., Cary, NC, USA. All rights reserved.
Details
PROC DMINE performs the following two tasks:
1. PROC DMINE first computes a forward stepwise least-squares regression. In each step, the
independent variable that contributes most to the model R-square value is selected. Two
parameters, MINR2 and STOPR2, can be specified to guide the variable selection process.
MINR2
If a variable has an individual R-square value smaller than MINR2, the variable is not
considered for selection into the model.
STOPR2
A second test is performed using the STOPR2 value: the remaining independent variable
with the largest contribution to the model R-square is added to the model. If the resulting
global R-square value changes from its former value by less than the STOPR2 value, then
the stepwise regression is terminated.
2. For a binary target (CLASS response variable), the second part of PROC DMINE runs a fast
algorithm for (approximate) logistic regression. The independent variable is the
prediction from the preceding least-squares regression. Because only one regression variable is
used in the logistic regression, only two parameters are estimated: the intercept and the slope.
The range of predicted values is divided into a number of equidistant intervals (knots), on which
the logistic function is interpolated.
If NOPRINT is not specified, a table is printed indicating the accuracy of the prediction of the
target.
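The two cutoffs and the second-stage model can be written compactly. The notation is not from the original text and is only a sketch under stated assumptions: $R^2_j$ is assumed to be the R-square from regressing the target on input $j$ alone, $M_k$ is the set of inputs in the model after step $k$, and $R^2(M)$ is the R-square of the least-squares fit on $M$. The last line is the standard two-parameter logistic form; the knot-based interpolation that PROC DMINE uses to approximate it is not shown.

```latex
\begin{align*}
C &= \{\, j : R^2_j \ge \mathrm{MINR2} \,\}
  && \text{(inputs that pass the MINR2 screen)} \\
j_k &= \arg\max_{j \in C \setminus M_{k-1}} R^2\bigl(M_{k-1} \cup \{j\}\bigr)
  && \text{(input chosen at step } k\text{)} \\
\text{stop if}\quad & R^2\bigl(M_{k-1} \cup \{j_k\}\bigr) - R^2(M_{k-1}) < \mathrm{STOPR2} \\
\Pr(y = 1 \mid \hat{y}) &= \frac{1}{1 + \exp\bigl(-(\beta_0 + \beta_1 \hat{y})\bigr)}
  && \text{(second stage, binary target)}
\end{align*}
```

As an invented numeric illustration with MINR2=0.020 and STOPR2=0.0050: an input whose individual R-square is 0.015 would never be considered, and a step that moves the model R-square from 0.300 to 0.303 would end the selection, because 0.003 is less than 0.0050.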
Missing Values
Missing values are handled differently, depending on the type of variable.
• Missing values in categorical variables are replaced with a new category that represents missing
values.
• Missing values in noncategorical variables are replaced with the mean.
• Observations with missing target values are dropped from the data.
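PROC DMINE applies these rules itself, so no preprocessing is required. To see what they amount to, the following sketch reproduces them with ordinary Base SAS steps. It is an approximation, not the procedure's internal implementation, and all names (WORK.RAW, Y, X1, X2, C1) are hypothetical. Whether PROC DMINE computes the mean before or after dropping observations is not stated in the text; the sketch computes it afterward.

```sas
/* Sketch only: WORK.RAW, Y, X1, X2, and C1 are hypothetical names. */
data work.step1;
   set work.raw;
   if y = . then delete;             /* drop observations with a missing target    */
   if c1 = ' ' then c1 = 'MISSING';  /* missing class value becomes its own level; */
                                     /* assumes C1 holds at least 7 characters     */
run;

/* REPLACE without MEAN= substitutes each variable's own mean */
/* for its missing values and leaves other values unchanged.  */
proc standard data=work.step1 out=work.imputed replace;
   var x1 x2;
run;
```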
Examples
The following examples were executed on the Windows NT operating system; the version of the SAS
System was 6.12TS045.
Example 1: Modeling a Continuous Target with the DMINE Procedure (Simple Selection Settings)
Example 2: Including the AOV16 and Grouping Variables into the Analysis (Detailed Selection Settings)
Example 3: Modeling a Binary Target with the DMINE Procedure
Example 1: Modeling a Continuous Target with the DMINE Procedure (Simple Selection Settings)
Features:
• Setting the MINR2= and STOPR2= cutoff values.
• Specifying the target and input variables.
• Excluding the AOV16 variables by specifying the NOAOV16 option.
• Excluding the two-way class interactions by specifying the NOINTER option.
As a marketing analyst at a catalog company, you want to quickly identify the inputs that best predict the dollar amount
that customers will purchase from your new fall outerwear catalog. The fictitious catalog mailing data set is named
SAMPSIO.DMEXA1 (stored in the sample library). The data set contains 1,966 customer cases. The interval target
AMOUNT contains the purchase amount in dollars.
There are 48 input variables available for predicting the target. Note that PURCHASE is a binary target that is modeled
in "Example 3: Modeling a Binary Target with the DMINE Procedure", so it is not used as an input here. ACCTNUM is an
ID variable, which makes it unsuitable as an input variable.
Program
/* Create the DMDB data set and catalog used by PROC DMINE. */
proc dmdb batch data=sampsio.dmexa1 out=dmbexa1 dmdbcat=catexa1;
   id acctnum;
   var amount income homeval frequent recency age
       domestic apparel leisure promo7 promo13 dpm12
       county return mensware flatware homeacc lamps
       linens blankets towels outdoor coats wcoat
       wappar hhappar jewelry custdate numkids travtime job;
   class purchase(desc) marital ntitle gender telind
         aprtmnt snglmom mobile kitchen luxury dishes tmktord
         statecod race origin heat numcars edlevel;
run;

/* Select inputs for the interval target AMOUNT. The AOV16 variables */
/* and the two-way class interactions are excluded.                  */
proc dmine data=dmbexa1 dmdbcat=catexa1
           minr2=0.020 stopr2=0.0050
           noaov16
           nointer;
   var income homeval frequent recency age
       domestic apparel leisure promo7 promo13 dpm12
       county return mensware flatware homeacc lamps
       linens blankets towels outdoor coats wcoat
       wappar hhappar jewelry custdate numkids travtime job
       marital ntitle gender telind aprtmnt snglmom mobile
       kitchen luxury dishes tmktord statecod race origin heat
       numcars edlevel;
   target amount;
   title 'DMINE: Continuous Target';
run;
Output
DMINE Status Monitor
When you invoke the DMINE procedure, the DMINE Status Monitor window appears. This window monitors the
execution time of the procedure. To suppress the display of this window, specify the NOMONITOR option in the PROC
DMINE statement.
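For instance, the PROC DMINE step from this example could be run without the monitor window as sketched below. The input list is shortened here only for brevity, and the step assumes that the DMBEXA1 data set and CATEXA1 catalog from the PROC DMDB step above already exist.

```sas
/* Sketch: same call as in the Program section, with NOMONITOR added */
/* and a shortened VAR list.                                         */
proc dmine data=dmbexa1 dmdbcat=catexa1
           minr2=0.020 stopr2=0.0050
           noaov16
           nointer
           nomonitor;
   var income homeval frequent recency age;
   target amount;
   title 'DMINE: Continuous Target';
run;
```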