Efficient polygenic risk scores for biobank scale data by exploiting phenotypes from inferred relatives

Nat Commun. 2020 Jun 17;11(1):3074. doi: 10.1038/s41467-020-16829-x.

Abstract

Polygenic risk scores are emerging as a potentially powerful tool to predict future phenotypes of target individuals, typically using unrelated individuals, thereby devaluing information from relatives. Here, for 50 traits from the UK Biobank data, we show that a design of 5,000 individuals that includes first-degree relatives of target individuals can achieve a prediction accuracy similar to that of around 220,000 unrelated individuals (mean prediction accuracy = 0.26 vs. 0.24, mean fold-change = 1.06 (95% CI: 0.99-1.13), P-value = 0.08), despite a 44-fold difference in sample size. For lifestyle traits, the prediction accuracy with 5,000 individuals including first-degree relatives of target individuals is significantly higher than that with 220,000 unrelated individuals (mean prediction accuracy = 0.22 vs. 0.16, mean fold-change = 1.40 (95% CI: 1.17-1.62), P-value = 0.025). Our findings suggest that polygenic prediction integrating family information may help to accelerate precision health and clinical intervention.
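
The arithmetic behind the figures quoted above can be checked with a minimal sketch. The variable names are illustrative and not from the paper. The script confirms the 44-fold sample-size difference and computes the ratio of the reported mean accuracies. That ratio need not equal the reported mean fold-change: the reported means are rounded, and the fold-change is presumably averaged across traits rather than taken as a ratio of means (an assumption, since the abstract does not say how it was computed).

```python
# Figures quoted in the abstract; variable names are illustrative only.
n_unrelated = 220_000        # unrelated individuals in the large design
n_with_relatives = 5_000     # individuals in the design that includes relatives

# Sample-size ratio between the two designs.
sample_size_ratio = n_unrelated / n_with_relatives
print(sample_size_ratio)  # 44.0

# Reported mean prediction accuracies (relatives design vs. unrelated design).
acc_all = (0.26, 0.24)         # all 50 traits
acc_lifestyle = (0.22, 0.16)   # lifestyle traits only

# Ratio of the reported means. This is NOT the same statistic as the
# reported mean fold-change (1.06 and 1.40), which is assumed here to be
# an average of per-trait ratios.
ratio_of_means_all = acc_all[0] / acc_all[1]
ratio_of_means_lifestyle = acc_lifestyle[0] / acc_lifestyle[1]

print(f"ratio of means, all traits: {ratio_of_means_all:.3f}")
print(f"ratio of means, lifestyle traits: {ratio_of_means_lifestyle:.3f}")
```

Both ratios of means fall close to, but not exactly on, the reported mean fold-changes, which is expected under the assumption stated above.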

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Biological Specimen Banks*
  • Family Health*
  • Female
  • Genetic Predisposition to Disease
  • Genome, Human
  • Genome-Wide Association Study
  • Genotype
  • Humans
  • Life Style
  • Male
  • Models, Genetic
  • Multifactorial Inheritance*
  • Pedigree
  • Phenotype
  • Polymorphism, Single Nucleotide
  • Reproducibility of Results
  • Risk Assessment / methods*
  • United Kingdom