*The EMCLUS Procedure*
**Output from PROC EMCLUS**
The beginning of the output shows the initial model parameter estimates. Next, the estimated model parameters, sample means, and sample variances for the active primary clusters are displayed. Inactive clusters are shown with missing values. The sample mean and variance are calculated from the observations that are summarized in the primary clusters.

In the cluster summary table, the following statistics are listed:

**Current Frequency**
the number of observations that are summarized in a cluster during the current iteration.

**Total Frequency**
the cumulative sum of the current frequencies for each cluster.

**Proportion of Data Summarized**
the total frequency divided by the **Obs read in**.

**Nearest Cluster**
the primary cluster closest to a given primary cluster, based on the Euclidean distance between the estimated means of the two primary clusters.

**Distance**
the Euclidean distance from a primary cluster to its nearest cluster.
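
As a rough illustration of how **Nearest Cluster** and **Distance** relate to the estimated means, the following Python sketch computes both for a small set of made-up means. The function name `nearest_clusters` and the sample values are assumptions for illustration only; they are not part of PROC EMCLUS, and zero-based indices are used instead of the procedure's cluster numbers.

```python
import math

def nearest_clusters(means):
    """For each cluster, return (index of nearest other cluster, Euclidean distance)."""
    result = []
    for i, mean_i in enumerate(means):
        best_j, best_d = None, math.inf
        for j, mean_j in enumerate(means):
            if i == j:
                continue  # a cluster is never its own nearest cluster
            d = math.dist(mean_i, mean_j)  # Euclidean distance between the two means
            if d < best_d:
                best_j, best_d = j, d
        result.append((best_j, best_d))
    return result

# Hypothetical estimated means for three primary clusters (illustrative values).
means = [(0.0, 0.0), (3.0, 4.0), (10.0, 0.0)]
nearest = nearest_clusters(means)
for i, (j, d) in enumerate(nearest):
    print(f"cluster {i}: nearest cluster = {j}, distance = {d:.4f}")
```

Note that the relation is not symmetric in general: a cluster can be the nearest neighbor of another cluster without the reverse being true.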

The iteration summary table displays:

**Log-likelihood**
the average log-likelihood over all the observations that are read in.

**Obs read in this iteration**
the number of observations that are read in during the current iteration.

**Obs read in**
the cumulative sum of observations that are read in.

**Current Summarized**
the sum of the current frequencies across the primary clusters.

**Total Summarized**
the sum of the total frequencies across the primary clusters.

**Proportion Summarized**
the **Total Summarized** divided by the **Obs read in**.
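
The arithmetic behind the summarized counts can be sketched in a few lines of Python. The function `iteration_summary` and the frequencies below are hypothetical and serve only to show how the three quantities follow from the per-cluster frequencies and the cumulative number of observations read in.

```python
def iteration_summary(current_freq, total_freq, obs_read_in):
    """Combine per-cluster frequencies into the iteration-level summary values."""
    current_summarized = sum(current_freq)  # sum of Current Frequency over primary clusters
    total_summarized = sum(total_freq)      # sum of Total Frequency over primary clusters
    return {
        "current_summarized": current_summarized,
        "total_summarized": total_summarized,
        "proportion_summarized": total_summarized / obs_read_in,
    }

# Hypothetical frequencies for three primary clusters after 500 observations read in.
summary = iteration_summary(
    current_freq=[40, 25, 10],
    total_freq=[120, 80, 30],
    obs_read_in=500,
)
print(summary)
```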

If there are secondary clusters, the sample mean, sample variance, and the number of observations in secondary clusters are also displayed after the iteration summary table.

**Note:** The estimated variance parameter for each variable is bounded below by the value (var)*(eps), where var is the sample variance of that variable obtained from the observations read in at the first iteration, and eps is 10^-6. Both the standard and scaled EM algorithms are sometimes slow to converge; however, the scaled EM algorithm generally runs faster than the standard EM algorithm. Convergence may be sped up by increasing p and/or eps, or by using the CLEAR option. Changing these values may alter the parameter estimates.
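
One way to read the lower bound on the variance parameter is as a simple clamp, sketched below in Python. How the procedure enforces the bound internally is not described here, so the function `floor_variances` and its inputs are assumptions for illustration only.

```python
EPS = 1e-6  # the eps value given in the note

def floor_variances(estimated_var, first_iter_sample_var, eps=EPS):
    """Clamp each estimated variance to be no smaller than (var)*(eps).

    first_iter_sample_var holds, per variable, the sample variance of the
    observations read in at the first iteration.
    """
    return [max(v, s * eps) for v, s in zip(estimated_var, first_iter_sample_var)]

# Hypothetical values: the second and third estimates fall below their floors.
floored = floor_variances(
    estimated_var=[2.5, 0.0, 1e-9],
    first_iter_sample_var=[4.0, 9.0, 100.0],
)
print(floored)
```

A larger eps raises the floor for every variable, which is consistent with the remark that changing it may alter the parameter estimates.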

Copyright 2000 by SAS Institute Inc., Cary, NC, USA. All rights reserved.

**Example**
**Example 1: Syntax for PROC FASTCLUS**
**PROC FASTCLUS** DATA = *libref.SAS-data-set*
OUTSEEDS = *libref.SAS-data-set*
MAXCLUSTERS = *positive integer*;

**Example 2: Use of the EMCLUS Procedure**
A portion of the output returned by PROC FASTCLUS is summarized in the following table:

| Cluster | Frequency | RMS Std. Deviation |
|---------|-----------|--------------------|
| 1       | 500       | 101.34             |
| 2       | 3         | 1000.34            |
| 3       | 100       | 3.79               |
| 4       | 150       | 4.05               |

Clusters 1, 3, and 4 should be used as initial estimates for PROC EMCLUS because of their high frequency counts. Cluster 1 may actually be a group of clusters because of its high RMS Std. Deviation. Therefore, the PROC EMCLUS syntax could look like this:

    PROC EMCLUS DATA=
         CLUSTERS = 5
         INIT = FASTCLUS
         SEED =
         INITSTD = 50.0
         INITCLUS 1, 3, 4;
    run;

Note that CLUSTERS is set to 5, but any integer greater than or equal to 3 is appropriate because three clusters are specified in the INITCLUS option. Also, INITSTD could have been set to any number less than 101.34 and greater than 4.05.
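
The selection reasoning in this example can be retraced with a short Python sketch over the table values. The frequency cutoff `MIN_FREQUENCY` is a hypothetical threshold chosen only to reproduce the choice of clusters 1, 3, and 4; the text gives no numeric rule. The INITSTD bounds are copied from the statement above rather than derived.

```python
# PROC FASTCLUS summary values from the table in this example.
clusters = [
    {"cluster": 1, "frequency": 500, "rms_std": 101.34},
    {"cluster": 2, "frequency": 3,   "rms_std": 1000.34},
    {"cluster": 3, "frequency": 100, "rms_std": 3.79},
    {"cluster": 4, "frequency": 150, "rms_std": 4.05},
]

MIN_FREQUENCY = 100  # hypothetical cutoff for a "high" frequency count

# Clusters with enough observations to serve as initial estimates.
selected = [c["cluster"] for c in clusters if c["frequency"] >= MIN_FREQUENCY]

# CLUSTERS must be at least the number of clusters listed in INITCLUS.
n_clusters = 5
clusters_ok = n_clusters >= len(selected)

# INITSTD range as stated in the text: greater than 4.05 and less than 101.34.
initstd = 50.0
initstd_ok = 4.05 < initstd < 101.34

print(selected, clusters_ok, initstd_ok)
```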
