Genotype Imputation in Genome-Wide Association Studies

Curr Protoc Hum Genet. 2019 Jun;102(1):e84. doi: 10.1002/cphg.84.

Abstract

Genotype imputation infers missing genotypes in silico using haplotype information from reference samples that were genotyped on denser arrays or by sequencing. This approach can benefit genome-wide association studies in several ways: it can improve statistical power to detect associations by reducing the number of missing genotypes; it can simplify data harmonization for meta-analyses by improving the overlap of genomic variants between differently genotyped sample sets; and it can increase the overall number and density of genomic variants available for association testing. This article reviews the general concepts behind imputation, describes imputation approaches and methods for various types of genotype data, including family-based data, and identifies web-based resources that can be used in different steps of the imputation process. For practical application, it provides a step-by-step guide to implementing a two-step imputation process: phasing of the study genotypes, followed by imputation of reference panel genotypes into the study haplotypes. In addition, this review describes recently developed haplotype reference panel resources and online imputation servers that can remotely and securely run an imputation workflow on uploaded genotype array data. © 2019 by John Wiley & Sons, Inc.
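
As a rough illustration of the imputation step only, the toy sketch below fills untyped sites in an already phased study haplotype by copying alleles from the reference haplotype that agrees best at the typed sites. It is not the method of any tool covered by the article: production software typically uses probabilistic haplotype-copying models rather than a single nearest match, and the function name, the data layout (0/1 alleles, None for an untyped site) and the example panel are all assumptions made for illustration.

```python
from typing import List, Optional, Sequence

# A haplotype is a sequence of 0/1 alleles; None marks an untyped site.
# This encoding is an assumption for the sketch, not a standard format.
Haplotype = Sequence[Optional[int]]


def impute_haplotype(study: Haplotype,
                     reference_panel: List[Sequence[int]]) -> List[int]:
    """Fill untyped sites by copying from the closest reference haplotype.

    "Closest" means fewest allele mismatches at the sites that were
    actually typed in the study haplotype. Ties go to the first match.
    """
    typed_sites = [i for i, allele in enumerate(study) if allele is not None]

    def mismatches(ref: Sequence[int]) -> int:
        return sum(1 for i in typed_sites if ref[i] != study[i])

    best = min(reference_panel, key=mismatches)
    # Keep observed alleles; borrow the reference allele where untyped.
    return [best[i] if allele is None else allele
            for i, allele in enumerate(study)]


# Hypothetical reference panel: three haplotypes over five variant sites.
reference_panel = [
    [0, 0, 1, 0, 1],
    [1, 1, 0, 1, 0],
    [0, 1, 1, 1, 1],
]

# Study haplotype typed at sites 0, 2 and 4 only.
study_haplotype = [1, None, 0, None, 0]

imputed = impute_haplotype(study_haplotype, reference_panel)
print(imputed)  # [1, 1, 0, 1, 0]
```

In this example the second reference haplotype matches every typed site, so its alleles at sites 1 and 3 are copied into the study haplotype. Phasing, the first step of the two-step process, is assumed to have been done already.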

Keywords: 1000 Genomes Project; HapMap Project; genome-wide association studies; imputation; inference; linkage disequilibrium; rare variants.

Publication types

  • Research Support, N.I.H., Extramural
  • Review

MeSH terms

  • Gene Frequency
  • Genetic Linkage
  • Genome, Human
  • Genome-Wide Association Study / methods*
  • Genotyping Techniques / methods*
  • Haplotypes
  • Humans
  • Polymorphism, Single Nucleotide
  • Software
  • Whole Genome Sequencing