Accuracy of imputation to infer unobserved APOE epsilon alleles in genome-wide genotyping data

Eur J Hum Genet. 2014 Oct;22(10):1239-42. doi: 10.1038/ejhg.2013.308. Epub 2014 Jan 22.


Apolipoprotein E, encoded by APOE, is the main apoprotein for catabolism of chylomicrons and very-low-density lipoprotein. Two common single-nucleotide polymorphisms (SNPs) in APOE, rs429358 and rs7412, determine the three epsilon alleles that are established genetic risk factors for late-onset Alzheimer's disease (AD), cerebral amyloid angiopathy, and intracerebral hemorrhage (ICH). These two SNPs are absent from most commercially available genome-wide genotyping arrays and cannot be inferred through imputation using HapMap reference panels. Therefore, these SNPs are often genotyped separately. The introduction of reference panels compiled from the 1000 Genomes Project has made imputation of these variants possible. We compared the directly genotyped and imputed SNPs that define the APOE epsilon alleles to determine the accuracy of imputation for inference of unobserved epsilon alleles. We used genome-wide genotype data from two cohorts of ICH and AD comprising subjects of European ancestry. Our data suggest that imputation is highly accurate, yields an acceptable proportion of missing data that is non-differentially distributed across case and control groups, and generates results comparable to those from genotyped data for hypothesis testing. Further, we explored the effect of imputation algorithm parameters and demonstrated that customization of these parameters yields an improved balance between accuracy and missing data for inferred genotypes.
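
As an illustration of how the two SNPs define the epsilon alleles, and of how imputed calls might be scored against directly genotyped ones, the following is a minimal Python sketch and not the authors' pipeline. It assumes forward-strand genotypes, the standard mapping (rs429358 C tags ε4, rs7412 T tags ε2, otherwise ε3), and the common convention of calling double heterozygotes ε2/ε4. The function names `epsilon_genotype` and `concordance` are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def epsilon_genotype(rs429358: Optional[str], rs7412: Optional[str]) -> Optional[str]:
    """Infer an unphased APOE genotype such as 'e3/e4' from two SNP genotypes.

    Each argument is a two-letter forward-strand genotype (e.g. 'TC'),
    or None when the call is missing.
    """
    if rs429358 is None or rs7412 is None:
        return None
    g1, g2 = rs429358.upper(), rs7412.upper()
    for g in (g1, g2):
        if len(g) != 2 or set(g) - set("TC"):
            raise ValueError(f"unexpected genotype: {g!r}")
    n_e4 = g1.count("C")  # rs429358 C allele tags e4
    n_e2 = g2.count("T")  # rs7412 T allele tags e2
    if n_e4 + n_e2 > 2:
        # Only explainable by the rare C-T haplotype; treated as missing here.
        return None
    n_e3 = 2 - n_e4 - n_e2
    # Assumption: a double heterozygote (n_e4 == n_e2 == 1) is called e2/e4.
    return "/".join(["e2"] * n_e2 + ["e3"] * n_e3 + ["e4"] * n_e4)


def concordance(
    genotyped: Sequence[Optional[str]], imputed: Sequence[Optional[str]]
) -> Tuple[float, float]:
    """Return (accuracy among called imputed genotypes, imputed missing rate).

    Only subjects with a non-missing directly genotyped call are evaluated.
    """
    pairs = [(g, i) for g, i in zip(genotyped, imputed) if g is not None]
    if not pairs:
        raise ValueError("no directly genotyped subjects to compare")
    called = [(g, i) for g, i in pairs if i is not None]
    missing_rate = 1 - len(called) / len(pairs)
    accuracy = sum(g == i for g, i in called) / len(called) if called else float("nan")
    return accuracy, missing_rate
```

Reporting accuracy and missingness separately mirrors the trade-off described in the abstract: a stricter calling threshold on imputed genotypes would be expected to raise accuracy among called genotypes at the cost of more missing data.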

Publication types

  • Comparative Study
  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Alleles*
  • Alzheimer Disease / genetics
  • Apolipoproteins E / genetics*
  • Case-Control Studies
  • Cerebral Hemorrhage / genetics
  • Gene Frequency
  • Genome, Human
  • Genome-Wide Association Study / methods*
  • Genotype
  • Genotyping Techniques
  • HapMap Project
  • Humans
  • Logistic Models
  • Longitudinal Studies
  • Polymorphism, Single Nucleotide
  • Principal Component Analysis
  • Prospective Studies
  • Quality Control
  • White People / genetics


Substances

  • Apolipoproteins E