entire data be stored in memory. The accuracy in estimating the quantiles is comparable for both
methods. The default is PCTLMTD=ORD_STAT if enough memory is available; otherwise,
Missing values can be replaced by the LOCATION measure or by any specified constant (see the
REPLACE option and the MISSING= option). You can also suppress standardization if you only want to
replace missing values (see the REPONLY option).
If the NOMISS option is used, PROC STDIZE omits observations that have any missing values in the
analyzed variables from computation of the location and scale measures. Otherwise, all nonmissing
values are used.
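The missing-value options above can be sketched as follows. This is a minimal illustration, not a definitive recipe: the data set WORK.TOY and all output data set names are made up, and option behavior may differ between releases. The first step fills missing values with the median without standardizing, the second fills them with the constant 0, and the third standardizes while leaving incomplete observations out of the location and scale computations.

```sas
/* Hypothetical toy data with missing values in both variables */
data work.toy;
   input x y;
   datalines;
1 10
2 .
. 30
4 40
5 50
;
run;

/* REPONLY with METHOD=MEDIAN: fill missing values with each variable's
   median and leave the nonmissing values unstandardized */
proc stdize data=work.toy out=work.toy_median method=median reponly;
   var x y;
run;

/* REPONLY with MISSING=0: fill missing values with the constant 0 */
proc stdize data=work.toy out=work.toy_zero reponly missing=0;
   var x y;
run;

/* NOMISS: compute location and scale only from observations that are
   complete on x and y, then standardize */
proc stdize data=work.toy out=work.toy_nomiss method=std nomiss;
   var x y;
run;
```

Printing the three output data sets side by side (for example with PROC PRINT) is a quick way to confirm which values were replaced and which were rescaled.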
Output Data Sets
The output data set is a copy of the DATA= data set except that the analyzed variables (those in the
VAR statement, or if there is no VAR statement, all numeric variables not listed in any other statement)
have been standardized.
The new data set contains the following variables:
the BY variables, if any;
the analyzed variables (those in the VAR statement, or if there is no VAR statement, all
numeric variables not listed in any other statement).
Each observation in the new data set contains some type of statistic as indicated by the _TYPE_ variable.
_TYPE_     Contents
SCALE      Scale measure of each variable.
ADD        Constant from ADD=. This value is the same for each variable.
MULT       Constant from MULT=. This value is the same for each variable.
N          Total number of frequencies of each variable.
NORM       Produced only if either the NORM option is specified with [...] IQR, MAD, or
           SPACING, or when the SNORM option is specified [...].
Pn         Percentiles of each variable specified by PCTLPTS=, where n is any real number
           such that 0 <= n <= 100.
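To see which _TYPE_ observations are written for a given run, one option is to request an OUTSTAT= data set and print it. The sketch below assumes a made-up data set WORK.TOY and hypothetical output names. The percentile points 25 and 75 are arbitrary choices for illustration, and the exact set of rows depends on the options and release in use.

```sas
/* Hypothetical toy data, no missing values */
data work.toy;
   input x y;
   datalines;
1 10
2 25
3 30
4 40
5 70
;
run;

/* Standardize by mean and standard deviation, and write the statistics,
   including the 25th and 75th percentiles, to the OUTSTAT= data set */
proc stdize data=work.toy out=work.toy_std outstat=work.toy_stats
            method=std pctlpts=25 75;
   var x y;
run;

/* Each row is one statistic, identified by the _TYPE_ variable */
proc print data=work.toy_stats;
run;
```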
If you specify the PSTAT option, PROC STDIZE displays the following statistics for each variable:
Name: the name of the variable
The formula for unstandardization is based on the location and scale measures and on the constants for
addition and multiplication. All of these are identified by the _TYPE_ variable in the SAS-data-set.
The SAS-data-set must have a _TYPE_ variable and must contain a _TYPE_=LOCATION observation
and a _TYPE_=SCALE observation. The _TYPE_=ADD and _TYPE_=MULT observations are
optional; if they are not found in the SAS-data-set, the constants specified in the ADD= and
MULT= options (or their default values) are used for unstandardization. See OUTSTAT= for details
about the kind of statistics represented by each value of _TYPE_.
The formula for unstandardization is:
   original = scale * (result - add) / multiply + location
where
   result    is the value obtained from the previous standardization
   add       is the constant to add (the value found in the _TYPE_ variable of the SAS-data-set or
             specified in the ADD= option)
   multiply  is the constant to multiply by (the value found in the _TYPE_ variable or specified in
             the MULT= option)
   original  is the original input value
   location  is the location measure
   scale     is the scale measure
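A round trip is one way to check the unstandardization described above. The sketch below assumes that the statistics data set is supplied through METHOD=IN(SAS-data-set) together with the UNSTD option, as in later releases of the procedure. WORK.TOY and every output data set name are hypothetical.

```sas
/* Hypothetical toy data, no missing values */
data work.toy;
   input x y;
   datalines;
1 10
2 25
3 30
4 40
5 70
;
run;

/* Step 1: standardize and keep the location and scale measures */
proc stdize data=work.toy out=work.toy_std outstat=work.toy_stats
            method=std;
   var x y;
run;

/* Step 2: reverse the standardization using the saved statistics */
proc stdize data=work.toy_std out=work.toy_back
            method=in(work.toy_stats) unstd;
   var x y;
run;

/* Step 3: compare with the original values, allowing for rounding error */
proc compare base=work.toy compare=work.toy_back criterion=1e-8;
   var x y;
run;
```

If the round trip works as expected, PROC COMPARE should report no unequal values for x and y beyond the stated tolerance.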
The following examples were executed using the HP-UX version 10.20 operating system and the SAS
software release 6.12TS045.
Example 1: Getting Started with the STDIZE Procedure
Example 2: Unstandardizing a Data Set
Example 3: Replacing Missing Values with Standardizing
Example 4: Replacing Missing Values without Standardizing the Variables
Copyright 2000 by SAS Institute Inc., Cary, NC, USA. All rights reserved.
Setting the METHOD= standardization statistic
Standardizing observations using BY-group processing
Outputting the OUT= standardized data set
Outputting the OUTSTAT= summary statistic data set
This example demonstrates how to center numeric variables by their medians with the STDIZE procedure.
Observations in the input data set are standardized separately in groups for each level of the binary target.
The example uses a fictitious mortgage data set named SAMPSIO.HMEQ, which contains 5,960 cases. It is stored
in the sample library. Each case represents an applicant for a home equity loan. All applicants have an existing
mortgage. The binary target BAD indicates whether or not an applicant eventually defaulted or was ever seriously
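proc sort data=sampsio.hmeq out=hmeq;
   by bad;                               /* sort by the target for BY-group processing */
run;
proc stdize data=hmeq method=median      /* center each variable by its median */
            out=stdhmeq outstat=stats;   /* placeholder names; the originals are not shown */
   var mortdue value yoj derog delinq
       clage ninq clno debtinc;
   by bad;
run;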
title2 'For Each Level of the Target BAD';