Translating polygenic risk scores for clinical use by estimating the confidence bounds of risk prediction

Nat Commun. 2021 Sep 6;12(1):5276. doi: 10.1038/s41467-021-25014-7.


A promise of genomics in precision medicine is to provide individualized genetic risk predictions. Polygenic risk scores (PRS), computed by aggregating effects from many genomic variants, have been developed as a useful tool in complex disease research. However, applying PRS to predict an individual's disease susceptibility in a clinical setting is challenging, because PRS typically provide a relative measure of risk evaluated at the level of a group of people rather than at the level of the individual. Here, we introduce a machine-learning technique, Mondrian Cross-Conformal Prediction (MCCP), to estimate the confidence bounds of PRS-to-disease-risk prediction. For each individual, MCCP reports a probability value conditional on disease status and gives a prediction at a desired error level. Moreover, given a user-defined prediction error rate, MCCP can estimate the proportion of the sample (coverage) that receives a correct prediction.
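The general idea of class-conditional (Mondrian) cross-conformal prediction can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`mccp_p_values`, `prediction_sets`), the simulated scores, the toy class-mean "model", and the nonconformity measure are all assumptions chosen to keep the example small. Each candidate label gets a p-value computed only against calibration individuals of that same class, and a label is kept in the prediction whenever its p-value exceeds the chosen error level.

```python
import numpy as np

def mccp_p_values(prs_train, y_train, prs_test, n_folds=5, seed=0):
    """Class-conditional cross-conformal p-values for labels 0 (control) and 1 (case)."""
    rng = np.random.default_rng(seed)
    prs_train = np.asarray(prs_train, dtype=float)
    y_train = np.asarray(y_train, dtype=int)
    prs_test = np.asarray(prs_test, dtype=float)
    folds = np.array_split(rng.permutation(len(prs_train)), n_folds)
    # counts[i, c]: calibration examples of class c that are at least as
    # nonconforming as test individual i would be under label c
    counts = np.zeros((len(prs_test), 2))
    for k in range(n_folds):
        calib = folds[k]
        fit = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        # Toy stand-in for a fitted model (assumption): per-class mean PRS
        mu = [prs_train[fit][y_train[fit] == c].mean() for c in (0, 1)]
        for c, sign in ((0, 1.0), (1, -1.0)):
            # Assumed nonconformity measure: a control is unusual when its
            # PRS is high, a case is unusual when its PRS is low
            calib_scores = sign * (prs_train[calib][y_train[calib] == c] - mu[c])
            test_scores = sign * (prs_test - mu[c])
            counts[:, c] += (calib_scores[None, :] >= test_scores[:, None]).sum(axis=1)
    n_per_class = np.array([(y_train == 0).sum(), (y_train == 1).sum()])
    return (counts + 1) / (n_per_class + 1)

def prediction_sets(p_values, error_level):
    """Boolean mask: column c is True when label c is kept at the given error level."""
    return p_values > error_level

# Simulated data (assumption): cases have a PRS shifted up by one standard deviation
rng = np.random.default_rng(42)
y_train = rng.integers(0, 2, 2000)
prs_train = rng.normal(y_train * 1.0, 1.0)
y_test = rng.integers(0, 2, 1000)
prs_test = rng.normal(y_test * 1.0, 1.0)

p_values = mccp_p_values(prs_train, y_train, prs_test)
sets = prediction_sets(p_values, error_level=0.1)
single_label = sets.sum(axis=1) == 1           # exactly one label retained
correct = sets[np.arange(len(y_test)), y_test]  # true label retained
print("fraction with a single-label prediction:", single_label.mean())
print("fraction whose set contains the true label:", correct.mean())
```

In this sketch, individuals whose prediction set contains both labels or neither label are treated as receiving no confident call at the chosen error level. Lowering `error_level` typically leaves more individuals in that uncertain group, which is the trade-off between error rate and coverage that the abstract describes.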

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Age Factors
  • Biological Specimen Banks
  • Breast Neoplasms / genetics
  • Coronary Artery Disease / genetics
  • Diabetes Mellitus, Type 2 / genetics
  • Female
  • Genetic Predisposition to Disease / genetics*
  • Humans
  • Inflammatory Bowel Diseases / genetics
  • Machine Learning*
  • Male
  • Multifactorial Inheritance / genetics*
  • Reproducibility of Results
  • Schizophrenia / genetics
  • Sweden
  • United Kingdom