Methods of correcting for multiple testing: operating characteristics

Stat Med. 1997 Nov 30;16(22):2511-28. doi: 10.1002/(sici)1097-0258(19971130)16:22<2511::aid-sim693>3.0.co;2-4.


We examine the operating characteristics of 17 methods for correcting p-values for multiple testing on synthetic data with known statistical properties. These methods operate on the derived p-values only and not on the raw data. With the test cases, we systematically varied the number of p-values, the proportion of false null hypotheses, the probability that a false null hypothesis would result in a p-value less than 5 per cent, and the degree of correlation between p-values. We examined the effect of each of these factors on family-wise and false negative error rates and compared the false negative error rates of methods with an acceptable family-wise error rate. Only four methods were not bettered in this comparison. Unfortunately, however, a uniformly best method of those examined does not exist. A suggested strategy for examining corrections uses a succession of methods that are increasingly lax in family-wise error. A computer program for these corrections is available.
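As an illustration of corrections that need only the p-values, the sketch below computes adjusted p-values under three well-known procedures: Bonferroni, Holm, and Hochberg. The abstract does not list the 17 methods examined, so the choice of these three is an assumption made for illustration, and the function names are hypothetical. This is not the computer program the authors mention.

```python
def bonferroni(pvals):
    """Bonferroni: multiply every p-value by the number of tests, cap at 1."""
    m = len(pvals)
    return [min(1.0, m * p) for p in pvals]


def holm(pvals):
    """Holm step-down: the k-th smallest p-value is multiplied by (m - k + 1),
    then a running maximum keeps the adjusted values monotone."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # ascending p-values
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        value = min(1.0, (m - rank) * pvals[i])
        running_max = max(running_max, value)
        adjusted[i] = running_max
    return adjusted


def hochberg(pvals):
    """Hochberg step-up: same multipliers as Holm, but a running minimum is
    taken starting from the largest p-value."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i], reverse=True)  # descending
    adjusted = [0.0] * m
    running_min = 1.0
    for rank, i in enumerate(order):
        value = min(1.0, (rank + 1) * pvals[i])
        running_min = min(running_min, value)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical p-values, not data from the paper.
    example = [0.01, 0.04, 0.03, 0.005]
    for name, method in [("Bonferroni", bonferroni), ("Holm", holm), ("Hochberg", hochberg)]:
        print(name, [round(p, 4) for p in method(example)])
```

For any set of p-values, the Bonferroni adjustment is at least as large as Holm's, which is at least as large as Hochberg's. Trying them in that order is one simple way to step through corrections of decreasing stringency. Hochberg's procedure relies on assumptions about the dependence between tests that the other two do not need.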

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, P.H.S.
  • Review

MeSH terms

  • Clinical Trials as Topic / methods*
  • Confidence Intervals
  • Data Interpretation, Statistical
  • Humans
  • Multivariate Analysis
  • Regression Analysis
  • Statistics as Topic / methods*
  • Statistics, Nonparametric