The arboretum procedure


Program: GLM Non-Full Rank (0, 1) Coding





 

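/* Create one 0/1 indicator per JOB level (GLM non-full-rank coding). */
data dumyhmeq;
   set hmeq;
   j_mgr=(job='Mgr');
   j_off=(job='Office');
   j_other=(job='Other');
   j_prof=(job='ProfExe');
   j_sales=(job='Sales');
   j_self=(job='Self');
run;

/* Fit the logistic model on all six indicators and save P_BAD1, */
/* the predicted probability that BAD = 1.                       */
proc logistic data=dumyhmeq descending noprint;
   model bad = j_mgr j_off j_other j_prof j_sales j_self
               loan mortdue value yoj derog
               clage ninq clno debtinc;
   output out=logfit(keep=bad p_bad1) p=p_bad1;
   title 'LOGISTIC Home Equity Data: GLM coding';
run;

/* Build the DMDB data set and catalog that PROC DMREG reads. */
proc dmdb batch data=hmeq
   out=dm_data dmdbcat=dm_cat;
   var loan mortdue value yoj derog
       clage ninq clno debtinc;
   class bad(desc)
         reason(asc)
         job(asc);
   target bad;
run;

/* Fit the same model in PROC DMREG with GLM coding and write the scores to DMSCORE. */
proc dmreg data=dm_data
           dmdbcat=dm_cat
           noprint;
   class bad job;
   model bad = job loan mortdue value yoj derog
               clage ninq clno debtinc / coding=glm;
   score out=dmscore;
   title1 'DMREG Home Equity Data: GLM coding';
run;

/* Check that the two sets of predicted probabilities agree */
/* to within an absolute difference of 1e-7.                */
proc compare data=dmscore compare=logfit note
             method=absolute
             criterion=1e-7;
   var p_bad1;
run;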

Output: GLM Non-Full Rank (0, 1) Coding

PROC COMPARE results.

Copyright 2000 by SAS Institute Inc., Cary, NC, USA. All rights reserved.



 

PROC FREQ step to create a classification table for the categorical input JOB.

proc freq data=sampsio.hmeq;
   tables job / missing;
   title 'JOB Classification Table';
run;



 

SAS DATA step to replace the missing JOB values with the variable's mode,
'Other'. It does not matter whether you impute data before modeling: DMREG
and LOGISTIC produce the same results as long as you use the same method to
code the class variables. Some of the continuous inputs also have missing
values. Neither DMREG nor LOGISTIC uses observations that have missing values
in the analysis. You can impute the missing values for the continuous inputs
by using the STDIZE procedure.

data hmeq;
   set sampsio.hmeq;
   if job = ' ' then job='Other';
run;
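
The STDIZE imputation mentioned above could look like the following sketch. It assumes that the median is an acceptable replacement value; the output data set name HMEQ_IMP is hypothetical and is not used by the other steps in this example, which continue to read HMEQ.

```sas
/* Sketch: replace missing continuous inputs with the variable median. */
/* REPONLY replaces missing values only; nonmissing values are left    */
/* unstandardized. HMEQ_IMP is a hypothetical output data set name.    */
proc stdize data=hmeq out=hmeq_imp
            method=median reponly;
   var loan mortdue value yoj derog
       clage ninq clno debtinc;
run;
```

To model the imputed data, you would point the later steps at HMEQ_IMP instead of HMEQ.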



 

PROC TRANSREG step to create the design matrix for the classification input
JOB. The DESIGN option specifies that the goal is design matrix creation,
not analysis.

proc transreg data=hmeq design;



 

The MODEL statement specifies the class variable JOB. The DEVIATIONS
(or EFFECTS) t-option requests deviations-from-means coding.

   model class (job/deviations);




 

The ID statement also writes the target and the continuous inputs to the
temporary design matrix data set. PROC TRANSREG automatically creates the
macro variable &_TRGIND, which contains the list of independent variables.
This macro variable is used in the MODEL statement in PROC LOGISTIC.

   id bad loan mortdue value yoj derog clage ninq clno debtinc;
   output;
run;
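
Before fitting the model, it can help to confirm what PROC TRANSREG produced. The following sketch, which is not part of the original example, writes the contents of &_TRGIND to the log and lists the variables in the design matrix data set. It assumes that it runs immediately after the PROC TRANSREG step, so that _LAST_ still refers to the TRANSREG output.

```sas
/* Write the design-column names stored in &_TRGIND to the SAS log. */
%put Design columns: &_trgind;

/* List the variables in the most recently created data set.       */
/* PROC CONTENTS without OUT= creates no data set, so the TRANSREG */
/* output remains the most recent one for the steps that follow.   */
proc contents data=_last_;
run;
```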



 

You can also create the design matrix for the classification variable(s)
in a SAS DATA step, although this task is too time-consuming for databases
that contain several class variables. The DATA step below is commented out,
but it demonstrates how to manually code a categorical variable using the
deviations-from-means method.

/*
data dumyhmeq;
   set hmeq;
   * Self is the reference level. It has no column of its own and is coded -1 in every column;
   select (job);
      when ('Mgr') do;
         j_mgr=1; j_off=0; j_other=0; j_prof=0; j_sales=0;
      end;
      when ('Office') do;
         j_mgr=0; j_off=1; j_other=0; j_prof=0; j_sales=0;
      end;
      when ('Other') do;
         j_mgr=0; j_off=0; j_other=1; j_prof=0; j_sales=0;
      end;
      when ('ProfExe') do;
         j_mgr=0; j_off=0; j_other=0; j_prof=1; j_sales=0;
      end;
      when ('Sales') do;
         j_mgr=0; j_off=0; j_other=0; j_prof=0; j_sales=1;
      end;
      when ('Self') do;
         j_mgr=-1; j_off=-1; j_other=-1; j_prof=-1; j_sales=-1;
      end;
      otherwise;
   end;
run;
*/



 

PROC LOGISTIC step to model the binary target BAD. The macro variable
&_TRGIND supplies the names of the classification design matrix columns
created by the preceding PROC TRANSREG run. The DESCENDING option causes the
procedure to model the probability that BAD = 1 (bad applicants).

/* No DATA= option: the step reads the most recently created data set, */
/* which is the design matrix written by PROC TRANSREG.                */
proc logistic descending;
   model bad = &_trgind loan mortdue value yoj
               derog clage ninq clno debtinc;
   title 'LOGISTIC Home Equity Data: Deviations from the Mean Coding';
run;
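
As an alternative to building the design matrix with PROC TRANSREG, the following sketch lets PROC LOGISTIC generate the same kind of coding itself. It is not part of the original example and assumes a release of PROC LOGISTIC that supports the CLASS statement. PARAM=EFFECT requests effect (deviations-from-means) coding, and REF=LAST makes the last ordered level, here Self, the reference level.

```sas
/* Sketch: deviations-from-means coding requested directly in PROC LOGISTIC. */
/* Assumes HMEQ already has the missing JOB values replaced.                 */
proc logistic data=hmeq descending;
   class job / param=effect ref=last;
   model bad = job loan mortdue value yoj
               derog clage ninq clno debtinc;
   title 'LOGISTIC Home Equity Data: Effect Coding via CLASS';
run;
```

Because the design columns have the same coding as the TRANSREG design matrix, this fit should reproduce the parameter estimates of the step above, apart from the parameter labels.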




