Projecting the performance of risk prediction based on polygenic analyses of genome-wide association studies

Nat Genet. 2013 Apr;45(4):400-5, 405e1-3. doi: 10.1038/ng.2579. Epub 2013 Mar 3.


We report a new method to estimate the predictive performance of polygenic models for risk prediction and assess predictive performance for ten complex traits or common diseases. Using estimates of effect-size distribution and heritability derived from current studies, we project that although 45% of the variance of height has been attributed to SNPs, a model trained on one million people may explain only 33.4% of the variance of the trait. Models based on current studies allow for identification of 3.0%, 1.1% and 7.0% of the population at twofold or higher than average risk for type 2 diabetes, coronary artery disease and prostate cancer, respectively. Tripling sample sizes could raise these percentages to 18.8%, 6.1% and 12.2%, respectively. The utility of polygenic models for risk prediction will depend on achievable sample sizes for the training data set, the underlying genetic architecture and the inclusion of information on other risk factors, including family history.
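
To make the "twofold or higher than average risk" figures concrete, the sketch below shows one common way such a percentage can be computed. It is not the method reported in the paper. It assumes that individual disease risk is log-normally distributed across the population, so that the log of risk is normal with standard deviation sigma, and it takes sigma as a given input. The function name, the parameter names and the sigma values in the usage lines are hypothetical and chosen only for illustration.

```python
import math


def fraction_at_or_above(fold: float, sigma: float) -> float:
    """Share of the population whose risk is at least `fold` times the
    population-average risk.

    Assumption (not taken from the abstract): log-risk is normally
    distributed with standard deviation `sigma`. Under that assumption the
    average risk is exp(mu + sigma**2 / 2), so the threshold on the log
    scale sits (ln(fold) + sigma**2 / 2) above the mean of log-risk.
    """
    if fold <= 0 or sigma <= 0:
        raise ValueError("fold and sigma must be positive")
    # Standardized distance from the mean of log-risk to the threshold.
    z = (math.log(fold) + sigma ** 2 / 2) / sigma
    # Upper-tail probability of a standard normal, via the complementary
    # error function: P(Z >= z) = 0.5 * erfc(z / sqrt(2)).
    return 0.5 * math.erfc(z / math.sqrt(2))


if __name__ == "__main__":
    # Hypothetical sigma values, only to show how the share changes
    # as the model captures more of the spread in risk.
    for sigma in (0.3, 0.5, 0.7):
        share = fraction_at_or_above(2.0, sigma)
        print(f"sigma={sigma:.1f}: {100 * share:.2f}% at twofold or higher risk")
```

Under this assumed model, a polygenic score that captures more of the spread in risk (a larger sigma, within the range shown) places a larger share of the population above the twofold threshold, which is the qualitative pattern the abstract describes when sample sizes grow.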

Publication types

  • Research Support, N.I.H., Intramural

MeSH terms

  • Algorithms*
  • Disease / genetics*
  • Female
  • Genome-Wide Association Study*
  • Humans
  • Models, Genetic
  • Models, Statistical*
  • Multifactorial Inheritance / genetics*
  • Risk Assessment
  • Risk Factors