Using GWAS summary data to impute traits for genotyped individuals

HGG Adv. 2023 Apr 12;4(3):100197. doi: 10.1016/j.xhgg.2023.100197. eCollection 2023 Jul 13.


Genome-wide association study (GWAS) summary data have become extremely useful in routine data analysis, facilitating the development of new methods and new applications. However, a severe limitation of their current use is that they support only linear single-nucleotide polymorphism (SNP)-trait association analyses. To expand the use of GWAS summary data, we propose a nonparametric method that, given a large sample of individual-level genotypes, imputes the genetic component of the trait for those genotypes at large scale. The imputed individual-level trait values, together with the individual-level genotypes, make it possible to conduct any analysis that could be done with individual-level GWAS data, including nonlinear SNP-trait associations and predictions. We use the UK Biobank data to demonstrate the usefulness and effectiveness of the proposed method in three applications that currently cannot be done with GWAS summary data alone (for SNP-trait associations): marginal SNP-trait association analysis under non-additive genetic models, detection of SNP-SNP interactions, and genetic prediction of a trait using a nonlinear model of SNPs.
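
As an illustration of the first application, the following minimal Python sketch shows how a marginal SNP-trait test under non-additive genetic models might be run once individual-level trait values are available. It does not reproduce the proposed imputation method: the array `y_imputed` is simulated here as a stand-in for imputed trait values, and the function names (`genotype_codings`, `ols_t_stat`, `scan_models`), the sample size, the allele frequency, and the effect size are all hypothetical choices made for this example.

```python
import numpy as np


def genotype_codings(g):
    """Return additive, dominant and recessive codings of 0/1/2 allele counts."""
    g = np.asarray(g)
    return {
        "additive": g.astype(float),
        "dominant": (g >= 1).astype(float),
        "recessive": (g == 2).astype(float),
    }


def ols_t_stat(x, y):
    """t statistic for the slope in a simple linear regression of y on x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    sxx = np.sum(xc ** 2)
    beta = np.sum(xc * yc) / sxx
    resid = yc - beta * xc
    sigma2 = np.sum(resid ** 2) / (len(y) - 2)  # residual variance
    return beta / np.sqrt(sigma2 / sxx)


def scan_models(g, y):
    """Test one SNP against the trait under each genetic model."""
    return {name: ols_t_stat(x, y) for name, x in genotype_codings(g).items()}


# Toy data (assumption): one SNP with a purely recessive effect on the trait.
rng = np.random.default_rng(0)
n = 20000
g = rng.binomial(2, 0.3, size=n)                    # genotypes coded 0/1/2
y_imputed = 0.5 * (g == 2) + rng.normal(size=n)     # stand-in for imputed trait values

stats = scan_models(g, y_imputed)
for model, t in stats.items():
    print(f"{model:>9s}: t = {t:.2f}")
```

In this toy setting the recessive coding is expected to give the largest test statistic, because the simulated effect is recessive. An additive-only analysis, the kind available from linear GWAS summary statistics, would still detect the signal here, but with less power.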

Keywords: Linear and nonlinear associations; Nonlinear models; PRS; SNP-SNP interactions; SNP-trait association.

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, N.I.H., Extramural

MeSH terms

  • Genome-Wide Association Study* / methods
  • Genotype
  • Humans
  • Phenotype
  • Polymorphism, Single Nucleotide* / genetics