Reduction in squared error from node means.
p-value of F-test associated with node variance. Default for INTERVAL.
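The "reduction in squared error from node means" criterion can be illustrated with a small calculation. The following is a minimal sketch, not PROC SPLIT's implementation: the function names and sample values are hypothetical, and it assumes the reduction is the parent's sum of squared errors minus the combined sum of squared errors of the child nodes.

```python
def sse(values):
    """Sum of squared deviations of the values from their own mean."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def squared_error_reduction(parent, children):
    """Parent SSE minus the total SSE of the child nodes (assumed definition)."""
    return sse(parent) - sum(sse(child) for child in children)


# Hypothetical interval target values in a node and in its two children.
parent = [1.0, 2.0, 9.0, 10.0]
left, right = [1.0, 2.0], [9.0, 10.0]
reduction = squared_error_reduction(parent, [left, right])
print(reduction)
```

A split that separates low target values from high ones, as here, removes most of the squared error; a split that leaves each child with the same mix of values as the parent removes almost none.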
Specifies that missing values be excluded during a split search.
Specifies the maximum number of candidate splits to evaluate in an exhaustive search. If more
candidates would have to be considered, a heuristic search is used instead. The EXHAUSTIVE option
applies to multi-way splits and to binary splits on nominal targets with more than two values.
The default value is 5000.
Requests that the tree created by PROC DMSPLIT be input to PROC SPLIT. The tree is expected
in the DMDBCAT= catalog. The DMDBCAT= option is required, and the INDMTREE and
INTREE= options are prohibited.
Names a data set created from the PROC SPLIT OUTTREE= option.
When using the INTREE option, the IN, TARGET, and FREQ statements are
prohibited, as are the DECISION and PRIORS statements.
Specifies the smallest number of training observations a node can have.
Specifies the proportion of observations to use with ASSESS=LIFT.
Restricts the number of subsets a splitting rule can produce to n or fewer. A value of 2 results in
2 - 100
Specifies the maximum number of generations of nodes. The original node, generation 0, is called
the root node. The children of the root node are the first generation. PROC SPLIT considers
splitting a node in the nth generation only when n is less than the specified depth.
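The generation rule can be written as a one-line check. This is a sketch of the rule as stated in the text, with a hypothetical function name; it is not taken from PROC SPLIT itself.

```python
def may_split(generation, depth):
    """A node in generation n is a split candidate only when n < depth."""
    return generation < depth


# With depth 2, the root (generation 0) and its children (generation 1)
# may be split, so the deepest possible leaves are in generation 2.
eligible = [g for g in range(4) if may_split(g, 2)]
print(eligible)
```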
Specifies the within node sample size used for finding splits. If the number of training
observations in a node is larger than n, then the split search for that node is based on a random
sample of size n.
1 <= n <= 32767
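The within-node sampling rule can be sketched as follows. This is an illustration only: the function name and the seed parameter are hypothetical, the bounds 1 and 32767 are read from the range line above, and the text does not say how PROC SPLIT draws its random sample.

```python
import random


def split_search_sample(observations, n, seed=None):
    """Return the observations a split search would use under the stated rule."""
    if not 1 <= n <= 32767:
        raise ValueError("n must be between 1 and 32767")
    # Small nodes are searched in full.
    if len(observations) <= n:
        return list(observations)
    # Larger nodes are searched on a random sample of size n.
    rng = random.Random(seed)
    return rng.sample(observations, n)


node = list(range(100))  # hypothetical node with 100 training observations
sample = split_search_sample(node, 10, seed=0)
print(len(sample))
```

Sampling bounds the cost of the split search in large nodes, at the price of a split that may differ slightly from the one found on all of the node's observations.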
Specifies how many splitting rules are saved with each node. The tree only uses one rule. The
remaining rules are saved for comparison. Based on the criterion you selected, you can see how
well the variable that was used split the data, and how well the next n-1 would have split the data.
Specifies the number of surrogate rules sought in each non-leaf node. A surrogate rule is a backup to
the main splitting rule. When the main splitting rule relies on an input whose value is missing, the
first surrogate rule is invoked. For more information, see
in the Detail section.
Note: The option to save surrogate rules in each node is often used by advocates of CART.
Names the output data set that is to contain a tree description suitable for inputting data into
SAS/AF widgets such as ORGCHART and TREERING.
A SAS/AF Widget is a visible part of a window, which can be treated as a
separate, isolated entity. For example, a SAS/AF Widget can be a scrollbar, a
text field, a pushbutton, and so on. It is an individual component of the user
Names the output data set that contains statistics for each leaf node.
Names the output data set that contains tree summary statistics. For nominal targets, the summary
statistics consist of the counts and proportions of observations correctly classified. For interval
targets, the summary statistics include the average squared prediction error and R-squared, which
Names the output data set that contains statistics on each sub-tree in the sub-tree sequence.
Names the output data set that contains all the tree information. This data set can then be used on
subsequent executions of PROC SPLIT.
Names the methods used to adjust the p-values of the PROBCHISQ and PROBFTEST criteria.
Possible methods are:
Bonferroni adjustment applied after the split is chosen.
Bonferroni adjustment applied before the split is chosen.
Turns off adjustment that sometimes overrides KASS.
This option is ignored unless CRITERION=PROBCHISQ or CRITERION=PROBFTEST is specified.
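The difference between adjusting before and after a split is chosen can be illustrated with a generic Bonferroni correction. This is a sketch under stated assumptions, not PROC SPLIT's procedure: the input names, the raw p-values, and the numbers of comparisons are hypothetical, and the multipliers that PROC SPLIT actually uses are not given in this section.

```python
def bonferroni(p_value, n_comparisons):
    """Generic Bonferroni correction: multiply by the comparison count, cap at 1."""
    return min(1.0, p_value * n_comparisons)


# Hypothetical candidates: input name -> (raw p-value, number of comparisons).
candidates = {"age": (0.010, 20), "region": (0.012, 3)}

# Adjust after choosing: pick the smallest raw p-value, then adjust it.
chosen_raw = min(candidates, key=lambda name: candidates[name][0])
p_after = bonferroni(*candidates[chosen_raw])

# Adjust before choosing: adjust every candidate, then pick the smallest.
adjusted = {name: bonferroni(p, m) for name, (p, m) in candidates.items()}
chosen_adjusted = min(adjusted, key=adjusted.get)

print(chosen_raw, p_after)
print(chosen_adjusted, adjusted[chosen_adjusted])
```

In this made-up case the two orders select different inputs, because the candidate with the smallest raw p-value also carries the largest number of comparisons.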
Specifies the number of inputs to consider uncorrelated when adjusting p-values for the number of
Specifies the smallest number of training observations a node must have for PROC SPLIT to
consider splitting it.
Maximum is 32767 on most machines.
The greater of 50 and the total number of cases in the training data set
divided by 100.
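The default can be checked with simple arithmetic. The sketch below uses a hypothetical function name and plain division; the text does not say whether PROC SPLIT rounds the quotient.

```python
def default_min_split_size(n_training_cases):
    """Greater of 50 and one hundredth of the training cases (rounding unspecified)."""
    return max(50, n_training_cases / 100)


small = default_min_split_size(1000)    # 1000 / 100 = 10, so 50 applies
large = default_min_split_size(20000)   # 20000 / 100 = 200, which exceeds 50
print(small, large)
```

The floor of 50 governs any training data set with fewer than 5000 cases; above that size, the one-percent rule takes over.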
Specifies how to construct the sub-tree in terms of selection methods. The following methods are