



The DECIDE Procedure

TARGET Statement

Specifies which variable in the DECDATA= data set contains values for the target variable.

Discussion: The DECIDE procedure will search for a target variable with the same name in the DATA= data set. If none is found, then the DECIDE procedure will assume that the actual target values are unknown. For a categorical target, the target variables in the DATA= and DECDATA= data sets need not be of the same type because the normalized formatted values are used for comparisons. For an interval target, both variables must be numeric. If scoring code is generated by a CODE statement, the code will format the target variable using the format and length from the DATA= data set.

Tip: The TARGET statement is required.

TARGET variable;

variable

Specifies the variable in the DECDATA= data set that contains the values for the target variable.

Copyright 2000 by SAS Institute Inc., Cary, NC, USA. All rights reserved.




Details

Note: Formulas for adjusting posterior probabilities and for decision processing are given in the chapter on "Predictive Modeling."


Example


Example 1: Using the DECIDE Procedure Following the DISCRIM Procedure

This example shows how to use the DECIDE procedure to adjust posterior probabilities from the DISCRIM procedure and how to make decisions using a revenue matrix and cost constants.

In a population of men who consult urologists for prostate problems, 70% have benign enlargement of the prostate, 25% have an infection, and 5% have cancer. A sample of 100 men is taken, and two new diagnostic measures, X and Y, are made on each patient. The training set also includes the diagnosis made by reliable, conventional methods. For each patient, two treatments are available:

1) Antibiotics are effective against infection but may have moderately bad side effects. Antibiotics have no effect on benign enlargement or cancer.
2) Surgery is effective for all diseases but has potentially severe side effects such as impotence.

There is also the option of doing nothing.
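The two computations this example walks through, adjusting posteriors for new priors and choosing the decision with the best expected profit, can be sketched as formulas. This is a sketch that assumes the usual Bayes prior-adjustment rule and is not quoted from SAS documentation; the authoritative formulas are in the "Predictive Modeling" chapter. The symbols are introduced here only for illustration: p_i is the posterior probability of diagnosis i from DISCRIM, pi_i^old is the prior it was computed under, pi_i^new is the population prior, R_id is the revenue of decision d when the true diagnosis is i, and c_d is the cost constant of decision d.

```latex
% Assumed notation (not from the SAS documentation):
%   p_i = DISCRIM posterior, \pi_i^{old}, \pi_i^{new} = old and new priors,
%   R_{id} = revenue matrix entry, c_d = cost constant of decision d.
\[
  \tilde p_i \;=\;
  \frac{p_i \,\pi_i^{\mathrm{new}} / \pi_i^{\mathrm{old}}}
       {\sum_j p_j \,\pi_j^{\mathrm{new}} / \pi_j^{\mathrm{old}}},
  \qquad
  \mathrm{profit}(d) \;=\; \sum_i \tilde p_i \, R_{id} \;-\; c_d,
  \qquad
  d^{*} \;=\; \arg\max_d \,\mathrm{profit}(d).
\]
```

Because the adjusted posteriors are renormalized, only the ratios among the priors matter under this rule, so priors written as percentages (70, 25, 5) give the same result as priors that sum to 1.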



Note: This example is purely fictional. Any resemblance to actual medical conditions or treatments is coincidental.

 

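data prostate;
   length dx $10;

   /* Benign enlargement: 70 of the 100 patients */
   dx='Benign';
   mx=30; sx=10;
   my=30; sy=10;
   n=70;
   link generate;

   /* Infection: 25 patients */
   dx='Infection';
   mx=70; sx=20;
   my=35; sy=15;
   n=25;
   link generate;

   /* Cancer: 5 patients */
   dx='Cancer';
   mx=50; sx=10;
   my=50; sy=15;
   n=5;
   link generate;

   stop;

/* Write n observations with normally distributed X and Y for the current diagnosis */
generate:
   do i=1 to n;
      x=rannor(12345)*sx+mx;
      y=rannor(0)    *sy+my;
      output;
   end;
   return;   /* go back to the statement after the calling LINK */
run;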


title2 'Diagnosis';
proc plot data=prostate; plot y*x=dx; run;

/* Classify patients by diagnosis from X and Y; DISCRIM uses equal priors by default */
proc discrim data=prostate out=outdis short;
   class dx;
   var x y;
run;

title2 'Classification with equal priors';
proc plot data=outdis; plot y*x=_into_; run;

 

/* Decision data set: prior probabilities plus a revenue matrix, one row per diagnosis */
data rx(type=revenue);
   input dx $10. eqprior prior nothing antibiot surgery;
   datalines;
Benign            .33      70     0        0        5
Infection         .33      25     0       10       10
Cancer            .33       5     0        0      100
;

 



/* Adjust the equal-prior posteriors to the population priors, then choose a treatment */
proc decide data=outdis out=outdec outstat=sumdec;
   target dx;
   posteriors benign infectio cancer;
   decision decdata=rx
            oldpriorvar=eqprior priorvar=prior
            decvars=nothing antibiot surgery
            cost=     0       5       20;
run;

title2 'Treatment: Cost of surgery=20';
proc print data=sumdec label; run;
proc plot data=outdec; plot y*x=d_rx; run;



 

/* Same analysis with the cost of surgery raised from 20 to 50 */
proc decide data=outdis out=outdec2 outstat=sumdec2;
   target dx;
   posteriors benign infectio cancer;
   decision decdata=rx
            oldpriorvar=eqprior priorvar=prior
            decvars=nothing antibiot surgery
            cost=     0       5       50;
run;

title2 'Treatment: Cost of surgery=50';
proc print data=sumdec2 label; run;
proc plot data=outdec2; plot y*x=d_rx; run;
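To see how the cost of surgery can change the chosen treatment, consider one hypothetical patient (not taken from the generated data) whose DISCRIM posteriors under equal priors are 0.05 for Benign, 0.15 for Infection, and 0.80 for Cancer. The arithmetic below is a sketch that assumes the usual Bayes prior adjustment (posterior times new prior over old prior, renormalized) and an expected profit equal to expected revenue minus the cost constant; the exact formulas that DECIDE uses are in the "Predictive Modeling" chapter. The equal old priors (.33 each) cancel in the renormalization, so they do not appear.

```latex
% Hypothetical patient; posteriors 0.05, 0.15, 0.80 are assumed for illustration.
\begin{align*}
\text{Prior-weighted posteriors:}\quad
  & 0.05 \cdot 70 = 3.5, \quad 0.15 \cdot 25 = 3.75, \quad 0.80 \cdot 5 = 4.0,
    \quad \text{sum} = 11.25 \\
\text{Adjusted posteriors:}\quad
  & \tfrac{3.5}{11.25} = \tfrac{14}{45} \approx 0.311, \quad
    \tfrac{3.75}{11.25} = \tfrac{15}{45} \approx 0.333, \quad
    \tfrac{4.0}{11.25} = \tfrac{16}{45} \approx 0.356 \\
\text{Expected revenue:}\quad
  & \text{nothing} = 0, \quad
    \text{antibiot} = 10 \cdot \tfrac{15}{45} \approx 3.33, \quad
    \text{surgery} = \tfrac{5 \cdot 14 + 10 \cdot 15 + 100 \cdot 16}{45}
                   = \tfrac{1820}{45} \approx 40.44 \\
\text{Profit, surgery cost 20:}\quad
  & 0, \quad 3.33 - 5 = -1.67, \quad 40.44 - 20 = 20.44
    \;\Rightarrow\; \text{surgery} \\
\text{Profit, surgery cost 50:}\quad
  & 0, \quad 3.33 - 5 = -1.67, \quad 40.44 - 50 = -9.56
    \;\Rightarrow\; \text{nothing}
\end{align*}
```

Under these assumptions the same patient is sent to surgery when its cost is 20 but is left untreated when its cost is 50, which is the kind of shift the two D_RX plots make visible across the whole sample.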




 

Use DISCRIM to see how well inputs X and Y can classify each patient according to disease.

proc discrim data=prostate out=outdis short;
   class dx;
   var x y;
run;

title2 'Classification with equal priors';
proc plot data=outdis; plot y*x=_into_; run;


 

The following DATA step creates a decision data set containing prior probabilities and a revenue matrix. The revenue matrix indicates the benefit of each treatment. The costs of each treatment (such as bad side effects) will be specified later in a DECISION statement.

The variables are:

   EQPRIOR    The prior probabilities used by DISCRIM (equal, by default)
   PRIOR      The known proportions from the population
   NOTHING    The benefit of doing nothing (0)
   ANTIBIOT   The benefit of using antibiotics (cures infection, no benefit for other diseases)
   SURGERY    The benefit of surgery (cures all diseases)

data rx(type=revenue);
   input dx $10. eqprior prior nothing antibiot surgery;
   datalines;
Benign            .33      70     0        0        5
Infection         .33      25     0       10       10
Cancer            .33       5     0        0      100
;


