Method=VARIANCE
Reduction in squared error from node means.
Method=PROBF
p-value of the F test associated with node variance. Default for interval targets.
Method=F
F statistic associated with node variance.
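As an illustration of the quantities behind the VARIANCE and F methods, the following Python sketch computes the reduction in squared error from node means and a one-way ANOVA F statistic for a candidate split. It is not PROC SPLIT code; the function name split_stats and the sample values are hypothetical, and the exact statistic PROC SPLIT computes internally is assumed here to match the textbook form.

```python
def split_stats(branches):
    """branches: one list of interval target values per child node (hypothetical layout)."""
    values = [y for b in branches for y in b]
    n, k = len(values), len(branches)
    grand_mean = sum(values) / n
    # Squared error around the parent node mean.
    sse_parent = sum((y - grand_mean) ** 2 for y in values)
    # Squared error around each child node mean.
    sse_children = 0.0
    for b in branches:
        mean_b = sum(b) / len(b)
        sse_children += sum((y - mean_b) ** 2 for y in b)
    reduction = sse_parent - sse_children
    if sse_children == 0:
        # Perfect split: the F statistic is unbounded.
        return reduction, float("inf")
    # Textbook one-way ANOVA F: between-branch mean square over within-branch mean square.
    f_stat = (reduction / (k - 1)) / (sse_children / (n - k))
    return reduction, f_stat


reduction, f_stat = split_stats([[1.0, 2.0, 3.0], [7.0, 8.0, 9.0]])
print(reduction, f_stat)  # 54.0 54.0
```

The PROBF method would convert the F statistic to a p-value using k-1 and n-k degrees of freedom; that step is omitted to keep the sketch free of external libraries.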
EXCLUDEMISS
Specifies that missing values be excluded during a split search.
EXHAUSTIVE=n
Specifies the maximum number of candidate splits to evaluate in an exhaustive search. If more
candidates would have to be considered, a heuristic search is used instead. The EXHAUSTIVE option
applies to multi-way splits and to binary splits on nominal targets with more than 2 values.
Default:
The default value is 5000.
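To see why a cap on exhaustive search matters, the Python sketch below counts candidate splits with standard combinatorial formulas and compares the count to the EXHAUSTIVE value. How PROC SPLIT actually enumerates candidates is not stated in this section, so the counting functions and their names are assumptions for illustration only.

```python
from math import comb


def binary_nominal_candidates(k):
    # Ways to divide k unordered categories into two non-empty groups.
    return 2 ** (k - 1) - 1


def multiway_ordered_candidates(k, branches):
    # Ways to cut k ordered values into `branches` contiguous, non-empty groups.
    return comb(k - 1, branches - 1)


def search_mode(n_candidates, exhaustive=5000):
    # Exhaustive search is used unless more than `exhaustive` candidates are needed.
    return "exhaustive" if n_candidates <= exhaustive else "heuristic"


for k in (13, 14):
    count = binary_nominal_candidates(k)
    print(k, count, search_mode(count))
```

Under these assumptions, a 13-level nominal input yields 4095 binary candidates and stays within the default, while a 14-level input yields 8191 and would fall back to the heuristic search.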
INDMSPLIT
Requests that the tree created by PROC DMSPLIT be input to PROC SPLIT. The tree is expected
in the DMDBCAT= catalog. The DMDBCAT= option is required, and the INDMTREE and
INTREE= options are prohibited.
INTREE=SAS-tree-model
Names a data set created from the PROC SPLIT OUTTREE= option.
Caution:
When using the INTREE option, the IN, TARGET, and FREQ statements are
prohibited, as are the DECISION and PRIORS statements.
LEAFSIZE=n
Specifies the smallest number of training observations a node can have.
LIFTDEPTH=n
Specifies the proportion of observations to use with ASSESS=LIFT.
MAXBRANCH=n
Restricts the number of subsets a splitting rule can produce to n or fewer. A value of 2 results in
binary trees.
Range:
2 - 100
Default:
2
MAXDEPTH=depth
Specifies the maximum number of generations of nodes. The original node, generation 0, is called
the root node. The children of the root node are the first generation. PROC SPLIT considers
splitting a node in the nth generation only when n is less than the value of depth.
Default:
6
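The depth rule can be restated as a small Python sketch. The function names are hypothetical; the only logic taken from the description above is that a node in generation n is eligible for splitting when n is less than depth. The leaf bound follows from combining that rule with a MAXBRANCH limit.

```python
def may_split(generation, depth=6):
    # A node in generation n is considered for splitting only when n < depth.
    return generation < depth


def max_leaves(maxbranch=2, depth=6):
    # Nodes in generations 0..depth-1 may split, so the deepest generation is `depth`,
    # which can hold at most maxbranch ** depth nodes.
    return maxbranch ** depth


print(may_split(5), may_split(6))  # True False
print(max_leaves())                # 64
```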
NODESAMPLE=n
Specifies the within node sample size used for finding splits. If the number of training
observations in a node is larger than n, then the split search for that node is based on a random
sample of size n.
Default:
5000
Range:
1 ≤ n ≤ 32767
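The within-node sampling behavior can be sketched in Python as follows. This is an illustration only: the function name, the seed argument, and the use of simple random sampling without replacement are assumptions, since the section does not say how PROC SPLIT draws the sample.

```python
import random


def split_search_sample(node_rows, n=5000, seed=None):
    """Return the observations a split search would use for one node."""
    if len(node_rows) <= n:
        # Small node: the search uses every training observation in it.
        return list(node_rows)
    # Large node: the search uses a random sample of size n (assumed without replacement).
    return random.Random(seed).sample(node_rows, n)


rows = list(range(12000))
print(len(split_search_sample(rows, seed=1)))       # 5000
print(len(split_search_sample(rows[:300], seed=1)))  # 300
```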
NRULES=n
Specifies how many splitting rules are saved with each node. The tree uses only one rule; the
remaining rules are saved for comparison. Based on the criterion you selected, you can see how
well the variable that was used split the data, and how well the next n-1 variables would have split it.
Default:
5
NSURRS=n
Specifies the number of surrogate rules sought in each non-leaf node. A surrogate rule is a backup to
the main splitting rule. When the main splitting rule relies on an input whose value is missing, the
first surrogate rule is invoked. For more information, see Missing Values in the Detail section.
Note: The option to save surrogate rules in each node is often used by advocates of CART.
Default:
0
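A minimal Python sketch of the surrogate idea follows. The rule representation (input name plus threshold), the use of None for a missing value, and the fall-through to later surrogates when the first surrogate's input is also missing are all assumptions made for illustration; they are not taken from PROC SPLIT.

```python
def route(obs, rules):
    """Assign an observation to a branch.

    rules: the main splitting rule first, then surrogate rules in order.
    Each rule is a hypothetical (input_name, threshold) pair.
    """
    for name, threshold in rules:
        value = obs.get(name)
        if value is not None:
            # First rule whose input is present decides the branch.
            return "left" if value < threshold else "right"
    return None  # every rule relies on a missing input


rules = [("income", 40000), ("age", 35)]  # main rule, then one surrogate
print(route({"income": 52000, "age": 29}, rules))  # right (main rule)
print(route({"income": None, "age": 29}, rules))   # left (surrogate)
```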
OUTAFDS=SAS-data-set
Names the output data set that contains a tree description suitable for input to
SAS/AF widgets such as ORGCHART and TREERING.
Definition:
A SAS/AF Widget is a visible part of a window, which can be treated as a
separate, isolated entity. For example, a SAS/AF Widget can be a scrollbar, a
text field, a pushbutton, and so on. It is an individual component of the user
interface.
OUTLEAF=SAS-data-set
Names the output data set that contains statistics for each leaf node.
OUTMATRIX=SAS-data-set
Names the output data set that contains tree summary statistics. For nominal targets, the summary
statistics consist of the counts and proportions of observations correctly classified. For interval
targets, the summary statistics include the average squared prediction error and R-squared, which
equals
OUTSEQ=SAS-data-set
Names the output data set that contains statistics on each sub-tree in the sub-tree sequence.
OUTTREE=SAS-data-set
Names the output data set that contains all the tree information. This data set can then be used on
subsequent executions of PROC SPLIT.
PADJUST=methods
Names methods of adjusting the p-values used with the PROBCHISQ and PROBFTEST criteria.
Possible methods are:
KASSAFTER
Bonferroni adjustment applied after split is chosen.
KASSBEFORE
Bonferroni adjustment applied before split is chosen.
DEVILLE
Adjustment independent of number of branches in split.
DEPTH
Adjustment for number of ancestor splits.
NOGABRIEL
Turns off adjustment that sometimes overrides KASS.
NONE
No adjustment is made.
Caution:
This option is ignored unless CRITERION=PROBCHISQ or PROBFTEST.
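The difference between adjusting before and after the split is chosen can be sketched in Python. This is a generic Bonferroni illustration, not PROC SPLIT's algorithm: the function names, the candidate tuples, and the multipliers (standing in for the number of ways each split could have been formed) are hypothetical.

```python
def bonferroni(p, multiplier):
    # Bonferroni adjustment: scale the p-value and cap it at 1.
    return min(1.0, p * multiplier)


def choose_split(candidates, when="before"):
    """candidates: list of (name, raw_p, multiplier) tuples."""
    if when == "before":
        # Adjust every candidate first, then pick the smallest adjusted p-value.
        adjusted = [(bonferroni(p, m), name) for name, p, m in candidates]
        adj_p, name = min(adjusted)
        return name, adj_p
    # Pick the smallest raw p-value first, then adjust only the winner.
    name, p, m = min(candidates, key=lambda c: c[1])
    return name, bonferroni(p, m)


candidates = [("age", 0.001, 50), ("region", 0.002, 3)]
print(choose_split(candidates, when="before")[0])  # region
print(choose_split(candidates, when="after")[0])   # age
```

In this example the two orderings select different inputs: adjusting first penalizes the input with many possible splits, while adjusting afterward leaves the choice to the raw p-values.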
PVARS=n
Specifies the number of inputs to consider uncorrelated when adjusting p-values for the number of
inputs.
SPLITSIZE=n
Specifies the smallest number of training observations a node must have for PROC SPLIT to
consider splitting it.
Range:
Maximum is 32767 on most machines.
Default:
The greater of 50 and the total number of cases in the training data set
divided by 100.
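The default can be worked through with a short Python sketch. The function name is hypothetical, and because the section does not say how a fractional quotient is rounded, the value is left unrounded here.

```python
def default_splitsize(n_train):
    # Greater of 50 and (training cases / 100); rounding behavior is not specified.
    return max(50, n_train / 100)


print(default_splitsize(2000))   # 50
print(default_splitsize(20000))  # 200.0
```

With 2,000 training cases the quotient is 20, so the floor of 50 applies; with 20,000 cases the quotient of 200 takes over.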
SUBTREE=method
Specifies how to construct the sub-tree in terms of selection methods. The following methods are