Bibliography on Cluster Analysis
Warren S. Sarle
Originally published in the _SAS/STAT User's Guide_, 1990
Revised Sep 14, 1997

The clustering literature contains a vast number of useless
publications.  This bibliography is intended to concentrate on the more
useful ones.  Massart and Kaufman (1983) is the best elementary
introduction to cluster analysis.  Other important texts are Anderberg
(1973), Sneath and Sokal (1973), Duran and Odell (1974), Hartigan
(1975), Titterington, Smith, and Makov (1985), McLachlan and Basford
(1988), and Kaufman and Rousseeuw (1990).  Hartigan (1975) and Spath
(1980) give numerous FORTRAN programs for clustering.  Any prospective
user of cluster analysis should study the Monte Carlo results of
Milligan (1980), Milligan and Cooper (1985), and Cooper and Milligan
(1988).  Important references on the statistical aspects of clustering
include MacQueen (1967), Wolfe (1970), Scott and Symons (1971), Hartigan
(1977; 1978; 1981; 1985), Symons (1981), Everitt (1981), Sarle (1983),
Bock (1985), and Thode et al. (1988).  Bayesian methods have important
advantages over maximum likelihood; see Binder (1978; 1981), Banfield
and Raftery (1993), and Bensmail et al. (1997).  For fuzzy clustering,
see Bezdek (1981) and Bezdek and Pal (1992).  The signal-processing
perspective is provided by Gersho and Gray (1992). See Blashfield and
Aldenderfer (1978) for a discussion of the fragmented state of the
literature on cluster analysis.  Avoid articles in the Journal of
Marketing Research.  There is a separate list of references at the end
on nonparametric clustering methods, which define a cluster as a mode in
the probability density function; these nonparametric methods have major
advantages over all traditional methods.

Anderberg, M.R. (1973), _Cluster Analysis for Applications_, New
York: Academic Press, Inc.

Art, D., Gnanadesikan, R., and Kettenring, J.R. (1982), "Data-based
Metrics for Cluster Analysis," Utilitas Mathematica,
21A, 75-99.

Banfield, J.D. and Raftery, A.E. (1993), "Model-Based Gaussian and
Non-Gaussian Clustering," Biometrics, 49, 803-821.

Bensmail, H., Celeux, G., Raftery, A.E., and Robert, C.P. (1997),
"Inference in model-based cluster analysis," Statistics and Computing,
7, 1-10.

Bezdek, J.C. (1981), _Pattern Recognition with Fuzzy Objective Function
Algorithms_, New York: Plenum Press.

Bezdek, J.C. and Pal, S.K., eds. (1992), _Fuzzy Models for Pattern
Recognition_, New York: IEEE Press.

Binder, D.A. (1978), "Bayesian Cluster Analysis," Biometrika, 65,
31-38.

Binder, D.A. (1981), "Approximations to Bayesian Clustering Rules,"
Biometrika, 68, 275-285.

Blashfield, R.K. and Aldenderfer, M.S. (1978), "The Literature on
Cluster Analysis," Multivariate Behavioral Research,
13, 271-295.

Bock, H.H. (1985), "On Some Significance Tests in Cluster Analysis,"
Journal of Classification, 2, 77-108.

Calinski, T. and Harabasz, J. (1974), "A Dendrite Method for Cluster
Analysis," Communications in Statistics, 3, 1-27.

Cooper, M.C. and Milligan, G.W. (1988), "The Effect of
Error on Determining the Number of Clusters," Proceedings
of the International Workshop on Data Analysis, Decision
Support and Expert Knowledge Representation in Marketing and
Related Areas of Research, 319-328.

Duda, R.O. and Hart, P.E. (1973), _Pattern Classification and Scene
Analysis_, New York: John Wiley & Sons, Inc.

Duran, B.S. and Odell, P.L. (1974), _Cluster Analysis_, New York:
Springer-Verlag.

Engelman, L. and Hartigan, J.A. (1969), "Percentage Points of a Test
for Clusters," Journal of the American Statistical Association,
64, 1647-1648.

Everitt, B.S. (1979), "Unresolved Problems in Cluster Analysis,"
Biometrics, 35, 169-181.

Everitt, B.S. (1981), "A Monte Carlo investigation of the likelihood
ratio test for the number of components in a mixture of normal
distributions," Multivariate Behavioral Research, 16, 171-180.

Everitt, B.S. and Hand, D.J. (1981), _Finite Mixture Distributions_,
New York: Chapman and Hall.

Gersho, A. and Gray, R.M. (1992), _Vector Quantization and Signal
Compression_, Boston: Kluwer Academic Publishers.

Good, I.J. (1977), "The Botryology of Botryology," in Classification
and Clustering, ed. J. Van Ryzin, New York: Academic Press, Inc.

Hartigan, J.A. (1975), _Clustering Algorithms_, New York: John
Wiley & Sons, Inc.

Hartigan, J.A. (1977), "Distribution Problems in Clustering," in
Classification and Clustering, ed. J. Van Ryzin, New York:
Academic Press, Inc.

Hartigan, J.A. (1978), "Asymptotic Distributions for Clustering
Criteria," Annals of Statistics, 6, 117-131.

Hartigan, J.A. (1981), "Consistency of Single Linkage for High-Density
Clusters," Journal of the American Statistical Association, 76,
388-394.

Hartigan, J.A. (1985), "Statistical Theory in Clustering,"
Journal of Classification, 2, 63-76.

Hathaway, R.J. (1985), "A constrained formulation of maximum-likelihood
estimation for normal mixture distributions," Annals of Statistics, 13,
795-800.

Hawkins, D.M., Muller, M.W., and ten Krooden, J.A. (1982), "Cluster
Analysis," in Topics in Applied Multivariate Analysis, ed. D.M.
Hawkins, Cambridge: Cambridge University Press.

Hubert, L. (1974), "Approximate Evaluation Techniques for the
Single-Link and Complete-Link Hierarchical Clustering Procedures,"
Journal of the American Statistical Association, 69,
698-704.

Hubert, L.J. and Baker, F.B. (1977), "An Empirical Comparison of
Baseline Models for Goodness-of-Fit in r-Diameter Hierarchical
Clustering," in Classification and Clustering, ed. J. Van Ryzin,
New York: Academic Press, Inc.

Kaufman, L. and Rousseeuw, P.J. (1990), _Finding Groups in Data_,
New York: John Wiley & Sons, Inc.

Lee, K.L. (1979), "Multivariate Tests for Clusters," Journal of the
American Statistical Association, 74, 708-714.

Lindsay, B.G., and Basak, P. (1993), "Multivariate normal mixtures:
A fast consistent method of moments," Journal of the American
Statistical Association, 88, 468-476.

Ling, R.F. (1973), "A Probability Theory of Cluster Analysis," Journal
of the American Statistical Association, 68, 159-169.

MacQueen, J.B. (1967), "Some Methods for Classification and Analysis of
Multivariate Observations," Proceedings of the Fifth Berkeley
Symposium on Mathematical Statistics and Probability,
1, 281-297.

Marriott, F.H.C. (1971), "Practical Problems in a Method of Cluster
Analysis," Biometrics, 27, 501-514.

Marriott, F.H.C. (1975), "Separating Mixtures of Normal Distributions,"
Biometrics, 31, 767-769.

Massart, D.L. and Kaufman, L. (1983), _The Interpretation of
Analytical Chemical Data by the Use of Cluster Analysis_, New York:
John Wiley & Sons, Inc.

McClain, J.O. and Rao, V.R. (1975), "CLUSTISZ: A Program to Test for the
Quality of Clustering of a Set of Objects," Journal of Marketing
Research, 12, 456-460.

McLachlan, G.J. and Basford, K.E. (1988), _Mixture Models_,
New York: Marcel Dekker, Inc.

Mezzich, J.E. and Solomon, H. (1980), _Taxonomy and Behavioral
Science_, New York: Academic Press, Inc.

Milligan, G.W. (1980), "An Examination of the Effect of Six Types of
Error Perturbation on Fifteen Clustering Algorithms,"
Psychometrika, 45, 325-342.

Milligan, G.W. (1981), "A Review of Monte Carlo Tests of Cluster
Analysis," Multivariate Behavioral Research, 16, 379-407.

Milligan, G.W. and Cooper, M.C. (1985), "An Examination of Procedures
for Determining the Number of Clusters in a Data Set,"
Psychometrika, 50, 159-179.

Pollard, D. (1981), "Strong Consistency of k-Means Clustering,"
Annals of Statistics, 9, 135-140.

Priebe, C.E. (1994), "Adaptive mixtures," Journal of the American
Statistical Association, 89, 796-806.

Sarle, W.S. (1982), "Cluster Analysis by Least Squares," Proceedings of
the Seventh Annual SAS Users Group International Conference,
651-653.

Sarle, W.S. (1983), _Cubic Clustering Criterion_, SAS Technical
Report A-108, Cary, NC: SAS Institute Inc.

Scott, A.J. and Symons, M.J. (1971), "Clustering Methods Based on
Likelihood Ratio Criteria," Biometrics, 27, 387-397.

Sneath, P.H.A. and Sokal, R.R. (1973), _Numerical Taxonomy_, San
Francisco: W.H. Freeman.

Spath, H. (1980), _Cluster Analysis Algorithms_, Chichester,
England: Ellis Horwood.

Symons, M.J. (1981), "Clustering Criteria and Multivariate Normal
Mixtures," Biometrics, 37, 35-43.

Thode, H.C., Jr., Mendell, N.R., and Finch, S.J. (1988), "Simulated
percentage points for the null distribution of the likelihood ratio
test for a mixture of two normals," Biometrics, 44, 1195-1201.

Titterington, D.M., Smith, A.F.M., and Makov, U.E. (1985),
_Statistical Analysis of Finite Mixture Distributions_,
New York: John Wiley & Sons, Inc.

Vuong, Q.H. (1989), "Likelihood ratio tests for model selection and
non-nested hypotheses," Econometrica, 57, 307-333.

Ward, J.H. (1963), "Hierarchical Grouping to Optimize an Objective
Function," Journal of the American Statistical Association, 58,
236-244.

Wolfe, J.H. (1970), "Pattern Clustering by Multivariate Mixture
Analysis," Multivariate Behavioral Research, 5, 329-350.

Wolfe, J.H. (1978), "Comparative Cluster Analysis of Patterns of
Vocational Interest," Multivariate Behavioral Research,
13, 33-44.

*************************************************************************

More references for nonparametric estimation of clusters as modes:

Barnett, V., ed. (1981), _Interpreting Multivariate Data_, New York:
John Wiley & Sons, Inc.

Girman, C.J. (1994), "Cluster Analysis and Classification Tree
Methodology as an Aid to Improve Understanding of Benign Prostatic
Hyperplasia," Ph.D. thesis, Chapel Hill, NC: Department of
Biostatistics, University of North Carolina.

Gitman, I. (1973), "An Algorithm for Nonsupervised Pattern
Classification," IEEE Transactions on Systems, Man, and Cybernetics,
SMC-3, 66-74.

Hartigan, J.A. and Hartigan, P.M. (1985), "The Dip Test of
Unimodality," Annals of Statistics, 13, 70-84.

Hartigan, P.M. (1985), "Computation of the Dip Statistic to Test for
Unimodality," Applied Statistics, 34, 320-325.

Huizinga, D.H. (1978), "A Natural or Mode Seeking Cluster Analysis
Algorithm," Technical Report 78-1, Behavioral Research Institute, 2305
Canyon Blvd., Boulder, Colorado 80302.

Koontz, W.L.G. and Fukunaga, K. (1972a), "A Nonparametric
Valley-Seeking Technique for Cluster Analysis," IEEE Transactions on
Computers, C-21, 171-178.

Koontz, W.L.G. and Fukunaga, K. (1972b), "Asymptotic Analysis of a
Nonparametric Clustering Technique," IEEE Transactions on Computers,
C-21, 967-974.

Koontz, W.L.G., Narendra, P.M., and Fukunaga, K. (1976), "A
Graph-Theoretic Approach to Nonparametric Cluster Analysis," IEEE
Transactions on Computers, C-25, 936-944.

Minnotte, M.C. (1992), "A Test of Mode Existence with
Applications to Multimodality," Ph.D. thesis, Rice University,
Department of Statistics.

Mizoguchi, R. and Shimura, M. (1980), "A Nonparametric Algorithm for
Detecting Clusters Using Hierarchical Structure," IEEE Transactions on
Pattern Analysis and Machine Intelligence, PAMI-2, 292-300.

Mueller, D.W. and Sawitzki, G. (1991), "Excess mass estimates and tests
for multimodality," Journal of the American Statistical Association,
86, 738-746.

Polonik, W. (1993), "Measuring Mass Concentrations and Estimating
Density Contour Clusters--An Excess Mass Approach," Technical Report,
Beitraege zur Statistik Nr. 7, Universitaet Heidelberg.

SAS Institute Inc. (1993), _SAS/STAT Software: The MODECLUS Procedure_,
SAS Technical Report P-256, Cary, NC: SAS Institute Inc.

Silverman, B.W. (1986), _Density Estimation_, New York: Chapman and
Hall.

Tukey, P.A. and Tukey, J.W. (1981), "Data-Driven View Selection;
Agglomeration and Sharpening," in Barnett (1981).

Wong, M.A. (1982), "A Hybrid Clustering Method for Identifying
High-Density Clusters," Journal of the American Statistical
Association, 77, 841-847.

Wong, M.A. and Lane, T. (1983), "A _k_th Nearest Neighbor Clustering
Procedure," Journal of the Royal Statistical Society, Series B, 45,
362-368.

Wong, M.A. and Schaack, C. (1982), "Using the _k_th Nearest Neighbor
Clustering Procedure to Determine the Number of Subpopulations,"
_American Statistical Association 1982 Proceedings of the Statistical
Computing Section_, 40-48.