A review of clustering methodology is presented,
with emphasis on algorithm performance and the resulting
implications for applied research. After an overview
of the clustering literature, the clustering process
is discussed within a seven-step framework. The four
major types of clustering methods can be characterized
as hierarchical, partitioning, overlapping, and ordination
algorithms. Validation of such algorithms concerns
how well the methods recover cluster configurations
that are known to exist in the data. Validation approaches include
mathematical derivations, analyses of empirical
datasets, and Monte Carlo simulation methods. Next,
interpretation and inference procedures in cluster analysis
are discussed. Inference procedures involve testing
for significant cluster structure and determining
the number of clusters in the data. The
paper concludes with two sets of recommendations.
One set deals with topics in clustering that would benefit
from continued research into the methodology.
The other set offers recommendations for applied analyses
within the framework of the clustering process.