Overcoming the winner's curse: estimating penetrance parameters from case-control data

Am J Hum Genet. 2007 Apr;80(4):605-15. doi: 10.1086/512821. Epub 2007 Feb 16.


Genomewide association studies are now a widely used approach in the search for loci that affect complex traits. After detection of significant association, estimates of penetrance and allele-frequency parameters for the associated variant indicate the importance of that variant and facilitate the planning of replication studies. However, when these estimates are based on the original data used to detect the variant, the results are affected by an ascertainment bias known as the "winner's curse." The actual genetic effect is typically smaller than its estimate. This overestimation of the genetic effect may cause replication studies to fail because the necessary sample size is underestimated. Here, we present an approach that corrects for the ascertainment bias and generates an estimate of the frequency of a variant and its penetrance parameters. The method produces a point estimate and confidence region for the parameter estimates. We study the performance of this method using simulated data sets and show that it is possible to greatly reduce the bias in the parameter estimates, even when the original association study had low power. The uncertainty of the estimate decreases with increasing sample size, independent of the power of the original test for association. Finally, we show that application of the method to case-control data can improve the design of replication studies considerably.
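The bias described above can be illustrated with a small simulation. The sketch below is not the correction method presented in the paper; it only shows how conditioning on a significant result inflates the estimated effect. All parameter values (a true allelic odds ratio of 1.2, a control allele frequency of 0.3, 500 cases and 500 controls, a significance threshold of |z| > 2.58) and the function name `simulate_winners_curse` are assumptions chosen for illustration. The Wald test on the log odds ratio is likewise an assumed stand-in for whatever association test a real study would use.

```python
import math
import random


def simulate_winners_curse(true_or=1.2, control_freq=0.3, n_per_group=500,
                           z_threshold=2.58, n_studies=2000, seed=1):
    """Simulate many case-control studies of one variant and compare the
    average estimated log odds ratio over all studies with the average
    over only those studies that reached significance.
    All defaults are hypothetical values chosen for illustration."""
    rng = random.Random(seed)

    # Risk-allele frequency in cases implied by the allelic odds ratio.
    case_odds = true_or * control_freq / (1.0 - control_freq)
    case_freq = case_odds / (1.0 + case_odds)
    n_alleles = 2 * n_per_group  # two alleles per individual

    all_estimates = []
    significant_estimates = []
    for _ in range(n_studies):
        # 2x2 allele-count table: a, b = risk / other alleles in cases;
        # c, d = risk / other alleles in controls.
        a = sum(rng.random() < case_freq for _ in range(n_alleles))
        c = sum(rng.random() < control_freq for _ in range(n_alleles))
        b, d = n_alleles - a, n_alleles - c
        if min(a, b, c, d) == 0:
            continue  # skip degenerate tables (log OR undefined)

        log_or = math.log((a * d) / (b * c))
        se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
        all_estimates.append(log_or)

        # Two-sided Wald test on the log odds ratio.
        if abs(log_or / se) > z_threshold:
            significant_estimates.append(log_or)

    def mean(values):
        return sum(values) / len(values) if values else None

    return {
        "true_log_or": math.log(true_or),
        "mean_log_or_all": mean(all_estimates),
        "mean_log_or_significant": mean(significant_estimates),
        "n_significant": len(significant_estimates),
        "n_studies": len(all_estimates),
    }


result = simulate_winners_curse()
print("true OR:", round(math.exp(result["true_log_or"]), 3))
print("mean OR, all studies:", round(math.exp(result["mean_log_or_all"]), 3))
print("mean OR, significant studies only:",
      round(math.exp(result["mean_log_or_significant"]), 3))
print("fraction significant:",
      round(result["n_significant"] / result["n_studies"], 3))
```

With these assumed settings the test has low power, so a study passes the threshold only when its estimate happens to lie well above the true effect. Averaging over the significant studies alone therefore overstates the odds ratio, while the average over all studies stays close to the truth. A replication study sized from the inflated estimate would be underpowered, which is the problem the abstract describes.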

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms*
  • Bias*
  • Computer Simulation
  • Epidemiologic Research Design*
  • Gene Frequency / genetics*
  • Humans
  • Penetrance*
  • Reproducibility of Results
  • Sample Size