Bootstrapping cluster analysis: assessing the reliability of conclusions from microarray experiments

Proc Natl Acad Sci U S A. 2001 Jul 31;98(16):8961-5. doi: 10.1073/pnas.161273698. Epub 2001 Jul 24.


We introduce a general technique for making statistical inference from clustering tools applied to gene expression microarray data. The approach utilizes an analysis of variance model to achieve normalization and estimate differential expression of genes across multiple conditions. Statistical inference is based on the application of a randomization technique, bootstrapping. Bootstrapping has previously been used to obtain confidence intervals for estimates of differential expression for individual genes. Here we apply bootstrapping to assess the stability of results from a cluster analysis. We illustrate the technique with a publicly available data set and draw conclusions about the reliability of clustering results in light of variation in the data. The bootstrapping procedure relies on experimental replication. We discuss the implications of replication and good design in microarray experiments.
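
The general idea of bootstrapping a cluster analysis can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in the paper. It assumes a simplified model in which each log-expression value is a gene-by-condition mean plus an error term (normalization terms are omitted), a single pool of residuals shared by all genes, and a stand-in clustering rule that assigns each gene to the nearest of a few hypothetical template profiles. The function names, the templates, and the toy data are all invented for illustration.

```python
import random
from statistics import mean

def cell_means(data):
    # data[g][c] is a list of replicate log-expression values
    # for gene g under condition c (assumed layout).
    return [[mean(reps) for reps in gene] for gene in data]

def centered(profile):
    # Remove the gene's overall level so only the pattern across conditions remains.
    m = mean(profile)
    return [x - m for x in profile]

def assign(profile, templates):
    # Stand-in clustering rule (an assumption): index of the template
    # closest in squared Euclidean distance to the centered profile.
    p = centered(profile)
    dists = [sum((a - b) ** 2 for a, b in zip(p, t)) for t in templates]
    return dists.index(min(dists))

def bootstrap_stability(data, templates, n_boot=200, seed=0):
    rng = random.Random(seed)
    fitted = cell_means(data)
    # Pool residuals across all genes and conditions (assumes homogeneous error).
    residuals = [y - fitted[g][c]
                 for g, gene in enumerate(data)
                 for c, reps in enumerate(gene)
                 for y in reps]
    original = [assign(p, templates) for p in fitted]
    agree = [0] * len(data)
    for _ in range(n_boot):
        # Simulated data set: fitted values plus residuals drawn with replacement.
        boot = [[[fitted[g][c] + rng.choice(residuals) for _rep in reps]
                 for c, reps in enumerate(gene)]
                for g, gene in enumerate(data)]
        labels = [assign(p, templates) for p in cell_means(boot)]
        for g, (a, b) in enumerate(zip(original, labels)):
            agree[g] += (a == b)
    # Fraction of bootstrap data sets in which each gene keeps its original cluster.
    return original, [a / n_boot for a in agree]

# Hypothetical templates: increasing and decreasing over three conditions.
templates = [[-1.0, 0.0, 1.0], [1.0, 0.0, -1.0]]
# Toy data: 3 genes x 3 conditions x 2 replicates.
data = [
    [[-1.1, -0.9], [0.1, -0.1], [0.9, 1.1]],   # clearly increasing
    [[1.0, 1.2], [0.0, -0.2], [-1.1, -0.9]],   # clearly decreasing
    [[0.05, -0.1], [0.0, 0.1], [0.0, 0.05]],   # nearly flat
]
original, stability = bootstrap_stability(data, templates)
print(original, stability)
```

In this toy example the first two genes keep their assignment in every bootstrap data set because their trends are large relative to the residuals, while the nearly flat third gene can switch clusters from one resample to the next. Replication matters here because residuals can only be estimated when each gene-by-condition cell has more than one observation.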

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Analysis of Variance
  • Cluster Analysis*
  • Models, Statistical
  • Oligonucleotide Array Sequence Analysis*