CLUST_WTS

The CLUST_WTS function computes the weights (the cluster centers) of an n-column, m-row array, where n is the number of variables and m is the number of observations or samples. CLUST_WTS uses k-means clustering. With this technique, CLUST_WTS starts with k random clusters and then iteratively moves items between clusters, minimizing variability within each cluster and maximizing variability between clusters.

Note: Because the initial clusters are chosen randomly, your results may differ each time the CLUST_WTS routine is invoked, even for the same input data. For data with well-defined clusters, the differences should be slight. For randomly scattered data (no distinguishable clusters), the results may differ significantly, which may indicate that k-means clustering is not appropriate for your data.
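The run-to-run variation described in the note can be observed by calling CLUST_WTS twice on the same data. This is a minimal sketch; the variable names and the synthetic data are illustrative assumptions, not part of the routine.

```idl
; Hypothetical data: 2 variables (columns), 100 samples (rows),
; split into two offset groups so that clusters are well defined.
a = RANDOMN(seed, 2, 50) - 3
b = RANDOMN(seed, 2, 50) + 3
array = [[a], [b]]

; Two independent runs on identical input.
w1 = CLUST_WTS(array, N_CLUSTERS=2)
w2 = CLUST_WTS(array, N_CLUSTERS=2)

; The centers should be close for well-separated data, but the
; rows (clusters) may appear in a different order between runs.
PRINT, w1
PRINT, w2
```

When comparing runs, match centers by proximity rather than by row index.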

Tip: For hierarchical tree clustering, see the CLUSTER_TREE function.

For more information on cluster analysis, see:

Everitt, Brian S. Cluster Analysis. New York: Halsted Press, 1993. ISBN 0-470-22043-0

Examples

See CLUSTER.
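As a brief illustration of how CLUST_WTS is typically paired with CLUSTER, the following sketch builds three synthetic groups and computes their centers. The data and variable names are assumptions chosen for illustration.

```idl
; Build three groups of 50 samples in 3 variables.
; Layout: 3 columns (variables) by 150 rows (samples).
n = 50
c1 = RANDOMN(seed, 3, n)
c1[0:1, *] -= 3
c2 = RANDOMN(seed, 3, n)
c2[0, *] += 3
c2[1, *] -= 3
c3 = RANDOMN(seed, 3, n)
c3[1:2, *] += 3
array = [[c1], [c2], [c3]]

; Compute three cluster centers.
weights = CLUST_WTS(array, N_CLUSTERS=3)

; Inspect the result: one row per cluster, one column per variable.
HELP, weights
PRINT, weights

; Assign each sample to its nearest center.
result = CLUSTER(array, weights, N_CLUSTERS=3)
```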

Syntax

Result = CLUST_WTS( Array [, /DOUBLE] [, N_CLUSTERS=value] [, N_ITERATIONS=integer] [, VARIABLE_WTS=vector] )

Return Value

Returns an n-column, N_CLUSTERS-row array of cluster centers, where n is the number of variables in the input array. Each row holds the center of one cluster.

Arguments

Array

An n-column, m-row array of any data type except string or single- or double-precision complex.

Keywords

DOUBLE

Set this keyword to force the computation to be done in double-precision arithmetic.

N_CLUSTERS

Set this keyword equal to the number of cluster centers. The default is to compute m cluster centers.

N_ITERATIONS

Set this keyword equal to the number of iterations used in computing the cluster centers. The default is to use 20 iterations.

VARIABLE_WTS

Set this keyword equal to an n-element vector of floating-point variable weights, one per variable. The elements of this vector are used to give greater or lesser importance to each variable (each column) in determining the cluster centers. The default is to give all variables equal weighting using a value of 1.0.
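A short sketch of passing per-variable weights follows. The data, the weight values, and the variable names are hypothetical; the only requirement illustrated is that the vector has one element per column of the input array.

```idl
; Hypothetical data: 4 variables (columns), 200 samples (rows).
array = RANDOMU(seed, 4, 200)

; One weight per variable: emphasize column 0, de-emphasize column 3.
var_wts = [2.0, 1.0, 1.0, 0.5]

weights = CLUST_WTS(array, N_CLUSTERS=3, VARIABLE_WTS=var_wts)
HELP, weights
```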

Version History

5.0	Introduced

Product	IDL

Version	9.2