*The STDIZE Procedure*
**Overview**
The STDIZE procedure standardizes one or more numeric variables in a SAS data set by subtracting a location measure and dividing by a scale measure. A variety of location and scale measures are provided, including estimates that are resistant to outliers and clustering (see the METHOD= option). You can also multiply each standardized value by a constant and add a constant. Thus the result is:

$$\text{result} = \text{adder} + \text{multiplier} \times \frac{\text{original} - \text{location}}{\text{scale}}$$

where:

*result*
is the final output value

*adder*
is the constant to add (the value specified in the ADD= option)

*multiplier*
is the constant to multiply by (the value specified in the MULT= option)

*original*
is the original input value

*location*
is the location measure

*scale*
is the scale measure
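The standardization described above (subtract the location measure, divide by the scale measure, multiply by the MULT= constant, then add the ADD= constant) can be sketched numerically. The following Python snippet is an illustration only, not PROC STDIZE itself: the function name `standardize`, the sample data, and the choice of the mean and sample standard deviation as the location and scale measures are all assumptions made for the example, as is raising an error when the scale is zero.

```python
from statistics import mean, stdev


def standardize(values, location, scale, add=0.0, mult=1.0):
    """Apply result = add + mult * (original - location) / scale to each value.

    `add` and `mult` play the role of the ADD= and MULT= constants.
    Rejecting a zero scale is an assumption of this sketch.
    """
    if scale == 0:
        raise ValueError("scale must be nonzero")
    return [add + mult * (x - location) / scale for x in values]


data = [2.0, 4.0, 6.0, 8.0]   # hypothetical input values
loc = mean(data)              # location measure assumed to be the mean
scl = stdev(data)             # scale measure assumed to be the sample std. dev.

z = standardize(data, loc, scl)                       # defaults: add=0, mult=1
t = standardize(data, loc, scl, add=50.0, mult=10.0)  # rescaled scores
```

With the default constants the output has location 0 and scale 1; with `add=50` and `mult=10` the same values are re-expressed with location 50 and scale 10.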

PROC STDIZE also finds quantiles in one pass of the data. This capability is especially useful when the data set is very large and PROC UNIVARIATE may either run out of memory or take a long time to compute the quantiles.

Copyright 2000 by SAS Institute Inc., Cary, NC, USA. All rights reserved.

**Procedure Syntax**
**PROC STDIZE** <*option(s)*>;

**BY** *variable-1* <... *variable-n*>;

**FREQ** *variable*;

**LOCATION** *variable(s)*;

**SCALE** *variable(s)*;

**VAR** *variable(s)*;

**WEIGHT** *variable*;

**PROC STDIZE Statement**
**Invokes the STDIZE procedure.**
**PROC STDIZE** <*option(s)*>;

**Options**
**ADD=***number*
Specifies the constant to add to each value after standardizing and multiplying by the MULT= number.

**Default:** 0

**DATA=***SAS-data-set*
Specifies the input data set to be standardized.

**Default:** _LAST_

**FUZZ=***c*
Specifies the relative fuzz factor for writing the output.

**Default:** 1E-14

For the OUT= data set: if …, then **result** = 0.

For the OUTSTAT= data set: if …, then SCALE = 0; otherwise, if …, then LOCATION = 0.

**INITIAL=***method-name*
Specifies the method for computing initial estimates for the A estimates: ABW, AWAVE, and AHUBER. See the Table of Methods for Computing Location and Scale Measures for the list of methods.

**CAUTION:** **ABW, AWAVE, AHUBER, and IN are not valid as INITIAL methods.**

**Default:** MAD

**METHOD=***method-name*
Specifies the name of the standardization method. See the Standardization Methods section for more information on the **method-names** that are available for computing LOCATION and SCALE measures.

**Default:** STD

**MISSING=***method* | *numeric-value* / <*missing-option(s)*>
Specifies the method or a numeric value for replacing missing values.

Use the MISSING= option when you want to replace missing values with something other than the location measure associated with the METHOD= option, which is what the REPLACE option uses as the replacement value. The usual methods include MEAN, MEDIAN, and MIDRANGE. Any of the values for the METHOD= option can also be specified for the MISSING= option, and the corresponding location measure will be used to replace missing values. If a numeric value is given, it replaces missing values after standardizing the data. However, the REPONLY option can be used together with the MISSING= option to suppress standardization if you only want to replace missing values.

See the Table of Methods for Computing Location and Scale Measures for a list of the values that can be specified for the MISSING= option (with the exception of MISSING=IN).
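As a rough illustration of the replace-only case described above (MISSING= combined with REPONLY, so that missing values are filled in but nothing is standardized), the following Python sketch fills missing entries with a chosen location measure or a fixed number. It is not PROC STDIZE code: the function name `replace_missing`, the use of `None` to represent a missing value, and the sample data are assumptions, and only the three methods named in the text (MEAN, MEDIAN, MIDRANGE) are covered.

```python
from statistics import mean, median


def replace_missing(values, missing="MEDIAN"):
    """Fill None entries without standardizing (a REPONLY-style sketch).

    `missing` is either a number or one of "MEAN", "MEDIAN", "MIDRANGE";
    the location measure is computed from the nonmissing values only.
    """
    present = [x for x in values if x is not None]
    if isinstance(missing, (int, float)):
        fill = float(missing)
    elif missing == "MEAN":
        fill = mean(present)
    elif missing == "MEDIAN":
        fill = median(present)
    elif missing == "MIDRANGE":
        fill = (min(present) + max(present)) / 2
    else:
        raise ValueError(f"unsupported method: {missing}")
    return [fill if x is None else x for x in values]


raw = [1.0, None, 3.0, 10.0]                    # hypothetical column
by_median = replace_missing(raw, "MEDIAN")      # missing value becomes 3.0
by_midrange = replace_missing(raw, "MIDRANGE")  # missing value becomes 5.5
by_constant = replace_missing(raw, 0)           # missing value becomes 0.0
```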

**MULT=***c*
Specifies the constant by which to multiply each value after standardizing.

**Default:** 1

**NMARKERS=***n*
Specifies the number of markers for the P2 algorithm (PCTLMTD=P2).

**Range:** Integer, where *n* ≥ 5

**Default:** 101

**NOMISS**
Omits observations that have missing values in the analyzed variables from computation of the location and scale measures. Otherwise, all nonmissing values are used.

**NORM**
For METHOD= AGK, IQR, MAD, or SPACING, normalizes the scale estimator to be consistent for the standard deviation of a normal distribution.
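To make "consistent for the standard deviation of a normal distribution" concrete, the sketch below normalizes the median absolute deviation (MAD). For normally distributed data the raw MAD estimates about 0.6745 times the standard deviation, so dividing by the 0.75 quantile of the standard normal (equivalently, multiplying by about 1.4826) puts it on the standard deviation scale. This is a general statistical fact; whether PROC STDIZE applies exactly this constant is an assumption here, and the names `mad` and `MAD_TO_SIGMA` are hypothetical.

```python
from statistics import NormalDist, median

# Factor that rescales the MAD to the standard deviation of a normal
# distribution: 1 / (0.75 quantile of N(0, 1)), roughly 1.4826.
MAD_TO_SIGMA = 1.0 / NormalDist().inv_cdf(0.75)


def mad(values, normalize=False):
    """Median absolute deviation about the median, optionally normalized."""
    center = median(values)
    raw = median([abs(x - center) for x in values])
    return raw * MAD_TO_SIGMA if normalize else raw


# Deterministic stand-in for a standard normal sample: evenly spaced quantiles.
n = 2001
sample = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]

raw_mad = mad(sample)                   # close to 0.6745
norm_mad = mad(sample, normalize=True)  # close to 1.0, the true std. dev.
```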

**OUT=***SAS-data-set*
Specifies the output data set created by PROC STDIZE. The output data set is a copy of the DATA= data set except that the analyzed variables (those in the VAR statement, or in the absence of a VAR statement, all numeric variables not listed in any other statement) have been standardized.

**Default:** _DATA_. If the OUT= option is omitted, PROC STDIZE creates an output data set and names it according to the DATA*n* convention, just as if you had omitted a data set name in a DATA statement.

**OUTSTAT=***SAS-data-set*
Specifies the output statistics data set that contains the location and scale measures and some other simple statistics. A _TYPE_ variable is also created to help identify the type of statistics for each observation. The value of the _TYPE_ variable can be:

LOCATION
Contains the location measure of each variable.

SCALE
Contains the scale measure of each variable.

NORM
Contains the norm measure of each variable.

ADD
Contains the constant from the ADD= option.

MULT
Contains the constant from the MULT= option.

N
Contains the total number of non-missing positive frequencies of each variable.

P*n*
Contains the percentiles of each variable specified through the PCTLPTS= option.

**Range:** 0 ≤ *n* ≤ 100

**PCTLDEF=***value*
Specifies which of the five percentile definitions, described in the **Computational Methods** section of the UNIVARIATE procedure, is used to calculate percentiles when PCTLMTD=ORD_STAT is specified.

**Default:** 5

**Range:** 1, 2, 3, 4, 5

**Tip:** When PCTLMTD=P2, the value of PCTLDEF is always 5.
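For readers unfamiliar with the default definition, the following Python sketch follows definition 5 as documented for PROC UNIVARIATE (the empirical distribution function with averaging): write n·p/100 = j + g with integer part j and fractional part g, then take the average of the j-th and (j+1)-th order statistics when g = 0 and the (j+1)-th order statistic otherwise. The details come from the UNIVARIATE documentation rather than from this section, and the function name `percentile_def5` and the clamping at the ends of the data are assumptions of the sketch.

```python
from fractions import Fraction
from math import floor


def percentile_def5(values, p):
    """Percentile p (0-100) by the empirical distribution function with averaging."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 <= p <= 100:
        raise ValueError("p must be between 0 and 100")
    xs = sorted(values)
    n = len(xs)
    # Exact arithmetic avoids spurious fractional parts from rounding.
    position = Fraction(n) * Fraction(p) / 100
    j = floor(position)   # integer part
    g = position - j      # fractional part
    # 1-based order statistics x_j and x_(j+1), clamped to the data range.
    lower = xs[max(j, 1) - 1]
    upper = xs[min(j + 1, n) - 1]
    return (lower + upper) / 2 if g == 0 else upper


scores = list(range(1, 11))            # hypothetical data: 1, 2, ..., 10
median_5 = percentile_def5(scores, 50)  # position is exactly 5, so average x_5 and x_6
q1_5 = percentile_def5(scores, 25)      # position is 2.5, so take x_3
```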

**PCTLMTD=***method*
Specifies the method used to estimate percentiles.

ORD_STAT