A comparison of bootstrap methods and an adjusted bootstrap approach for estimating the prediction error in microarray classification

Stat Med. 2007 Dec 20;26(29):5320-34. doi: 10.1002/sim.2968.

Abstract

This paper first provides a critical review of some existing methods for estimating the prediction error in classifying microarray data, where the number of genes greatly exceeds the number of specimens. Special attention is given to bootstrap-related methods. When the sample size n is small, we find that all the reviewed methods suffer from either substantial bias or substantial variability. We introduce a repeated leave-one-out bootstrap (RLOOB) method that predicts for each specimen in the sample using bootstrap learning sets of size ln. We then propose an adjusted bootstrap (ABS) method that fits a learning curve to the RLOOB estimates calculated with different bootstrap learning set sizes. The ABS method is robust across the situations we investigate and provides a slightly conservative estimate of the prediction error. Even with small samples, it does not suffer from large upward bias, as do the leave-one-out bootstrap and the 0.632+ bootstrap, nor from large variability, as does leave-one-out cross-validation in microarray applications.
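The two ideas named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: "ln" is read as a multiple l of the sample size n (size_factor below), bootstrap learning sets are drawn from the other n - 1 specimens, the classifier is a simple nearest-centroid rule, and the learning curve is a plain power law fitted on the log scale. The abstract does not specify the classifier, the functional form of the curve, or the learning set size at which the fitted curve is finally evaluated. The function names rloob_error and fit_power_law are hypothetical.

```python
import numpy as np


def nearest_centroid_predict(X_train, y_train, x_new):
    """Predict the class whose training centroid is closest to x_new."""
    classes = np.unique(y_train)
    centroids = np.array([X_train[y_train == c].mean(axis=0) for c in classes])
    distances = np.linalg.norm(centroids - x_new, axis=1)
    return classes[np.argmin(distances)]


def rloob_error(X, y, size_factor=1.0, n_boot=50, seed=0):
    """Repeated leave-one-out bootstrap error (illustrative sketch).

    For each specimen i, draw n_boot bootstrap learning sets of size
    round(size_factor * n) from the remaining specimens, train on each,
    and predict specimen i. Returns the mean misclassification rate.
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    m = max(2, int(round(size_factor * n)))  # assumed meaning of "size ln"
    per_specimen = []
    for i in range(n):
        others = np.delete(np.arange(n), i)  # specimen i is never in the learning set
        wrong = 0
        for _ in range(n_boot):
            idx = rng.choice(others, size=m, replace=True)
            pred = nearest_centroid_predict(X[idx], y[idx], X[i])
            wrong += int(pred != y[i])
        per_specimen.append(wrong / n_boot)
    return float(np.mean(per_specimen))


def fit_power_law(sizes, errors, floor=1e-3):
    """Fit error = a * size**(-alpha) by least squares on the log scale.

    The power-law form is an assumption; errors are floored to avoid log(0).
    Returns (a, alpha).
    """
    log_m = np.log(np.asarray(sizes, dtype=float))
    log_e = np.log(np.maximum(np.asarray(errors, dtype=float), floor))
    slope, intercept = np.polyfit(log_m, log_e, 1)
    return float(np.exp(intercept)), float(-slope)
```

Under these assumptions, one would call rloob_error for several values of size_factor, pass the resulting learning set sizes and error estimates to fit_power_law, and read the adjusted estimate off the fitted curve at whichever learning set size the method prescribes.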

Publication types

  • Comparative Study

MeSH terms

  • Artificial Intelligence
  • Bias
  • Biomarkers, Tumor / classification
  • Biomarkers, Tumor / genetics
  • Cluster Analysis
  • Computer Simulation
  • Confidence Intervals
  • Data Interpretation, Statistical
  • Diagnosis, Computer-Assisted / methods
  • Discriminant Analysis*
  • Gene Expression Profiling / statistics & numerical data
  • Genetic Predisposition to Disease
  • Genetic Testing / methods
  • Humans
  • Logistic Models
  • Lymphoma / diagnosis
  • Lymphoma / genetics
  • Lymphoma / metabolism
  • Oligonucleotide Array Sequence Analysis / statistics & numerical data*
  • Reproducibility of Results
  • Sample Size
  • Sampling Studies
  • Sensitivity and Specificity
  • Statistics, Nonparametric

Substances

  • Biomarkers, Tumor