BR-squared: a practical solution to the winner's curse in genome-wide scans

Hum Genet. 2011 May;129(5):545-52. doi: 10.1007/s00439-011-0948-2. Epub 2011 Jan 19.

Abstract

The detrimental effects of the winner's curse, including overestimation of the genetic effects of associated variants and underestimation of the sample sizes needed for replication studies, are well recognized in genome-wide association studies (GWAS). These effects can be expected to worsen as the field moves from GWAS to whole genome sequencing. To date, few studies have reported statistical adjustments to the naive estimates, owing to a lack of suitable statistical methods and computational tools. We have developed an efficient genome-wide non-parametric method that explicitly accounts for the threshold, ranking, and allele frequency effects in whole genome scans. Here, we implement the method to provide bias-reduced estimates via bootstrap re-sampling (BR-squared) for association studies of both disease status and quantitative traits, and we report the results of applying BR-squared to GWAS of psoriasis and HbA1c. We observed a reduction of over 50% in the estimated genetic effect size for many associated SNPs. This translates into a greater than fourfold increase in sample size requirements for successful replication studies, which in part explains some of the apparent failures to replicate the original signals. Our analysis suggests that adjusting for the winner's curse is critical for interpreting findings from whole genome scans and for planning replication and meta-GWAS studies, as well as in attempts to translate findings into the clinical setting.
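The link between a 50% smaller effect estimate and a fourfold larger replication sample can be checked with a short calculation. The sketch below is not the BR-squared software or the authors' power calculation. It assumes a standard normal-approximation formula for a quantitative trait under an additive model, in which the required sample size is proportional to 1 / (effect size squared). The function name `required_n` and all numeric inputs (effect size, allele frequency, significance level, power) are hypothetical values chosen for illustration.

```python
from statistics import NormalDist


def required_n(beta, maf, sigma=1.0, alpha=5e-8, power=0.8):
    """Approximate sample size needed to detect a per-allele effect `beta`.

    Assumptions (illustrative, not taken from the paper): quantitative
    trait with residual SD `sigma`, additive genetic model, genotypes in
    Hardy-Weinberg equilibrium, two-sided test at level `alpha`.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value of the test
    z_power = NormalDist().inv_cdf(power)          # quantile for target power
    var_g = 2 * maf * (1 - maf)                    # genotype variance under HWE
    return (z_alpha + z_power) ** 2 * sigma ** 2 / (var_g * beta ** 2)


# Hypothetical numbers: a naive estimate and a 50% bias-reduced estimate.
naive_beta = 0.10
adjusted_beta = naive_beta * (1 - 0.5)

n_naive = required_n(naive_beta, maf=0.3)
n_adjusted = required_n(adjusted_beta, maf=0.3)
inflation = n_adjusted / n_naive  # depends only on (naive_beta / adjusted_beta) ** 2

print(f"n based on naive estimate:    {n_naive:,.0f}")
print(f"n based on adjusted estimate: {n_adjusted:,.0f}")
print(f"sample size inflation factor: {inflation:.2f}")
```

Because the allele frequency, significance level, and power cancel in the ratio, the inflation factor under this approximation depends only on the squared ratio of the two effect estimates. Any reduction greater than 50% therefore implies more than a fourfold increase.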

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Gene Frequency
  • Genetic Predisposition to Disease
  • Genome-Wide Association Study / statistics & numerical data*
  • Glycated Hemoglobin A / genetics
  • Humans
  • Polymorphism, Single Nucleotide
  • Psoriasis / genetics
  • Quantitative Trait, Heritable
  • Sample Size
  • Statistics, Nonparametric*

Substances

  • Glycated Hemoglobin A
  • hemoglobin A1c protein, human