Cluster analysis and related techniques in medical research

Stat Methods Med Res. 1992;1(1):27-48. doi: 10.1177/096228029200100103.


In this paper we review methods of cluster analysis in the context of classifying patients on the basis of clinical and/or laboratory observations. Both hierarchical and non-hierarchical methods of clustering are considered, although the emphasis is on the latter, with particular attention devoted to the mixture likelihood-based approach. To divide a given data set into g clusters, this approach fits a mixture model of g components by the method of maximum likelihood. It thus provides a sound statistical basis for clustering. The important but difficult question of how many clusters there are in the data can be addressed within the framework of standard statistical theory, although theoretical and computational difficulties remain. Two case studies, involving the cluster analysis of haemophilia and diabetes data respectively, are reported to demonstrate the mixture likelihood-based approach to clustering.
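
As an illustration of the mixture likelihood-based approach described above, the following is a minimal sketch rather than the procedure used in the paper. It assumes univariate data and normal components with unequal variances, and it uses the EM algorithm as one common way to carry out the maximum likelihood fit. The function names, the quantile-based starting values, the variance floor and the stopping rule are all choices made for this example and are not taken from the paper.

```python
import numpy as np


def _e_step(x, weights, means, variances):
    """Return posterior membership probabilities and the log-likelihood."""
    # Log density of each observation under each normal component (n x g).
    log_dens = (-0.5 * np.log(2.0 * np.pi * variances)
                - 0.5 * (x[:, None] - means) ** 2 / variances)
    log_joint = np.log(weights) + log_dens
    # Log of the mixture density at each observation, computed stably.
    log_mix = np.logaddexp.reduce(log_joint, axis=1)
    post = np.exp(log_joint - log_mix[:, None])
    return post, log_mix.sum()


def fit_gaussian_mixture_1d(x, g, n_iter=500, tol=1e-8):
    """Fit a g-component univariate normal mixture by EM (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    # Starting values (an assumption of this sketch): evenly spaced sample
    # quantiles as means, the pooled variance, and equal mixing proportions.
    means = np.quantile(x, (np.arange(g) + 0.5) / g)
    variances = np.full(g, x.var())
    weights = np.full(g, 1.0 / g)
    prev_loglik = -np.inf
    for _ in range(n_iter):
        post, loglik = _e_step(x, weights, means, variances)
        if loglik - prev_loglik < tol:
            break
        prev_loglik = loglik
        # M-step: update the parameters using posterior-weighted averages.
        nk = post.sum(axis=0)
        weights = nk / n
        means = (post * x[:, None]).sum(axis=0) / nk
        variances = (post * (x[:, None] - means) ** 2).sum(axis=0) / nk
        # Floor on the variances (an assumption) to avoid degenerate fits.
        variances = np.maximum(variances, 1e-6)
    # Final E-step so the posteriors and log-likelihood match the
    # returned parameter values.
    post, loglik = _e_step(x, weights, means, variances)
    # Each observation is assigned to its most probable component.
    labels = post.argmax(axis=1)
    return weights, means, variances, labels, loglik
```

Calling `fit_gaussian_mixture_1d(x, g=2)` on a one-dimensional array `x` returns the estimated mixing proportions, component means and variances, a cluster label for each observation, and the maximized log-likelihood. Comparing that log-likelihood across different values of g is a natural starting point for the question of how many clusters there are, which is where the difficulties noted above arise.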

Publication types

  • Review

MeSH terms

  • Algorithms
  • Cluster Analysis*
  • Diabetes Mellitus / classification
  • Female
  • Hemophilia A / classification
  • Humans
  • Likelihood Functions
  • Research Design*