The DECIDE Procedure
TARGET Statement
Specifies which variable in the DECDATA= data set contains values for the target variable.
Discussion: The DECIDE procedure searches the DATA= data set for a target variable with the
same name. If none is found, the procedure assumes that the actual target values are
unknown. For a categorical target, the target variables in the DATA= and DECDATA= data
sets need not be of the same type, because the normalized formatted values are used for
comparisons. For an interval target, both variables must be numeric. If a CODE statement
generates scoring code, the code formats the target variable using the format and length
from the DATA= data set.
Tip: The TARGET statement is required.
TARGET variable;
variable
Specifies the variable in the DECDATA= data set that contains the values for the target variable.
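The point about comparing normalized formatted values, rather than raw values, can be illustrated with a small Python sketch. The normalization rule used here (convert to text, strip surrounding blanks, uppercase, truncate to 32 characters) is an assumption made for illustration; this section does not define the rule, and the function name normalize is hypothetical.

```python
def normalize(value, width=32):
    """Hypothetical normalization: format as text, strip blanks, uppercase, truncate.

    The exact rule used by the DECIDE procedure is not stated in this section;
    this is only an assumed stand-in.
    """
    return str(value).strip().upper()[:width]


# A character target in one data set and a differently cased, padded value in the other
same_category = normalize("Benign") == normalize("  BENIGN ")

# A numeric target in one data set and a character target in the other
same_across_types = normalize(1) == normalize("1")

print(same_category, same_across_types)
```

Under this assumed rule, a numeric target and a character target can still match, because only their formatted text is compared.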
Copyright 2000 by SAS Institute Inc., Cary, NC, USA. All rights reserved.
Details
Note: Formulas for adjusting posterior probabilities and for decision processing are given in the chapter
on "Predictive Modeling."
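As a rough illustration of what adjusting posterior probabilities for new priors involves, the Python sketch below applies the standard Bayes reweighting: each posterior is multiplied by the ratio of new prior to old prior, and the results are rescaled to sum to 1. This is an assumption about the form of the adjustment, not a statement of the formulas in the referenced chapter. The function name adjust_posteriors and the sample posteriors are hypothetical.

```python
def adjust_posteriors(posteriors, old_priors, new_priors):
    """Reweight posteriors by new/old prior ratios and renormalize.

    Assumed form of the adjustment (standard Bayes reweighting); the
    procedure's actual formulas are in the "Predictive Modeling" chapter.
    """
    weights = [p * new / old
               for p, old, new in zip(posteriors, old_priors, new_priors)]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical posteriors from a model fitted with equal priors
posteriors = [0.2, 0.3, 0.5]          # benign, infection, cancer
old_priors = [1 / 3, 1 / 3, 1 / 3]    # equal priors
new_priors = [70, 25, 5]              # population proportions, in percent

adjusted = adjust_posteriors(posteriors, old_priors, new_priors)
print([round(a, 4) for a in adjusted])
```

Because of the final rescaling, priors given as percentages (70, 25, 5) and as proportions (0.70, 0.25, 0.05) yield the same adjusted posteriors under this assumed formula.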
Example
Example 1: Using the DECIDE Procedure Following the DISCRIM Procedure
This example shows how to use the DECIDE procedure to adjust posterior probabilities from the
DISCRIM procedure, and how to make decisions using a revenue matrix and cost constants.
In a population of men who consult urologists for prostate problems, 70% have benign enlargement of
the prostate, 25% have an infection, and 5% have cancer. A sample of 100 men is taken, and two new
diagnostic measures, X and Y, are made on each patient. The training set also includes the diagnosis
made by reliable, conventional methods. For each patient, two treatments are available:
1) Antibiotics are effective against infection, but may have moderately bad side effects.
   Antibiotics have no effect on benign enlargement or cancer.
2) Surgery is effective for all diseases but has potentially severe side effects such as
   impotence.
There is also the option of doing nothing.
Note: This example is purely fictional. Any resemblance to actual medical conditions or treatments is
coincidental.
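data prostate;
   length dx $10;
   /* Benign: 70 patients */
   dx='Benign';
   mx=30; sx=10;
   my=30; sy=10;
   n=70;
   link generate;
   /* Infection: 25 patients */
   dx='Infection';
   mx=70; sx=20;
   my=35; sy=15;
   n=25;
   link generate;
   /* Cancer: 5 patients */
   dx='Cancer';
   mx=50; sx=10;
   my=50; sy=15;
   n=5;
   link generate;
   stop;
generate:
   /* Draw n patients with normal X (mean mx, std sx) and Y (mean my, std sy) */
   do i=1 to n;
      x=rannor(12345)*sx+mx;
      y=rannor(0)    *sy+my;
      output;
   end;
run;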
title2 'Diagnosis';
proc plot data=prostate;
   plot y*x=dx;
run;
proc discrim data=prostate out=outdis short;
   class dx;
   var x y;
run;
title2 'Classification with equal priors';
proc plot data=outdis;
   plot y*x=_into_;
run;
/* dx is read with the $10. informat, so the disease name must occupy columns 1-10 */
data rx(type=revenue);
   input dx $10. eqprior prior nothing antibiot surgery;
   datalines;
Benign     .33 70 0   0   5
Infection  .33 25 0  10  10
Cancer     .33  5 0   0 100
;
/* COST= values follow the DECVARS= order: nothing, antibiot, surgery */
proc decide data=outdis out=outdec outstat=sumdec;
   target dx;
   posteriors benign infectio cancer;
   decision decdata=rx
            oldpriorvar=eqprior priorvar=prior
            decvars=nothing antibiot surgery
            cost= 0 5 20;
run;
title2 'Treatment: Cost of surgery=20';
proc print data=sumdec label; run;
proc plot data=outdec; plot y*x=d_rx; run;
/* Same analysis with the cost of surgery raised from 20 to 50 */
proc decide data=outdis out=outdec2 outstat=sumdec2;
   target dx;
   posteriors benign infectio cancer;
   decision decdata=rx
            oldpriorvar=eqprior priorvar=prior
            decvars=nothing antibiot surgery
            cost= 0 5 50;
run;
title2 'Treatment: Cost of surgery=50';
proc print data=sumdec2 label; run;
proc plot data=outdec2; plot y*x=d_rx; run;
Use DISCRIM to see how well inputs X and Y can classify each patient
according to disease.
proc discrim data=prostate out=outdis short;
   class dx;
   var x y;
run;
title2 'Classification with equal priors';
proc plot data=outdis;
   plot y*x=_into_;
run;
The following DATA step creates a decision data set containing prior
probabilities and a revenue matrix. The revenue matrix indicates the benefit
of each treatment. The costs of each treatment (such as bad side effects)
will be specified later in a DECISION statement.
The variables are:
EQPRIOR  = The prior probabilities used by DISCRIM (equal, by default)
PRIOR    = The known proportions from the population
NOTHING  = The benefit of doing nothing (0)
ANTIBIOT = The benefit of using antibiotics (cures infection, no benefit for other diseases)
SURGERY  = The benefit of surgery (cures all diseases)
data rx(type=revenue);
   input dx $10. eqprior prior nothing antibiot surgery;
   datalines;
Benign     .33 70 0   0   5
Infection  .33 25 0  10  10
Cancer     .33  5 0   0 100
;
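To make the roles of the revenue matrix and the cost constants concrete, the following Python sketch picks a treatment for one patient. It assumes that the chosen decision is the one with the largest expected profit, computed as the posterior-weighted revenue minus the cost constant for that decision. That rule is an assumption for illustration; the exact formulas are in the "Predictive Modeling" chapter. The patient posteriors are invented, and the two cost settings are the COST= values 0 5 20 and 0 5 50 from the example program.

```python
# Revenue matrix from the RX data set: rows are diagnoses, columns are decisions
REVENUE = {
    "Benign":    {"nothing": 0, "antibiot": 0,  "surgery": 5},
    "Infection": {"nothing": 0, "antibiot": 10, "surgery": 10},
    "Cancer":    {"nothing": 0, "antibiot": 0,  "surgery": 100},
}

# Cost constants, in DECVARS= order, for the two runs in the example
COSTS_20 = {"nothing": 0, "antibiot": 5, "surgery": 20}
COSTS_50 = {"nothing": 0, "antibiot": 5, "surgery": 50}


def best_decision(posteriors, costs):
    """Return (decision, profits) under the assumed rule:
    expected profit = sum over diagnoses of posterior * revenue, minus cost."""
    profits = {
        decision: sum(posteriors[dx] * REVENUE[dx][decision] for dx in REVENUE) - cost
        for decision, cost in costs.items()
    }
    return max(profits, key=profits.get), profits


# Hypothetical adjusted posteriors for a single patient
patient = {"Benign": 0.60, "Infection": 0.10, "Cancer": 0.30}

choice_20, _ = best_decision(patient, COSTS_20)
choice_50, _ = best_decision(patient, COSTS_50)
print(choice_20, choice_50)
```

For this hypothetical patient, surgery has the highest expected profit when it costs 20, while doing nothing is preferred when surgery costs 50. This is the kind of shift that the two treatment plots in the example are meant to show.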