From FastQ data to high confidence variant calls: the Genome Analysis Toolkit best practices pipeline

Curr Protoc Bioinformatics. 2013;43(1110):11.10.1-11.10.33. doi: 10.1002/0471250953.bi1110s43.


This unit describes how to use BWA and the Genome Analysis Toolkit (GATK) to map genome sequencing data to a reference and produce high-quality variant calls that can be used in downstream analyses. The complete workflow includes the core NGS data processing steps that are necessary to make the raw data suitable for analysis by the GATK, as well as the key methods involved in variant discovery using the GATK.
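As an illustration only, the two ends of the workflow summarized above (mapping with BWA, variant calling with the GATK) might be sketched as follows. This sketch is not taken from the unit itself. It assumes BWA-MEM for the mapping step and GATK 3-style command-line syntax (`-T HaplotypeCaller`) for the calling step. The function name `build_pipeline`, the file names, the `GenomeAnalysisTK.jar` location, and the read-group fields are hypothetical placeholders. The intermediate data processing steps are not enumerated in this abstract and are therefore omitted rather than guessed. The function only assembles argument lists and does not run anything.

```python
from typing import List


def build_pipeline(ref: str, fq1: str, fq2: str, sample: str) -> List[List[str]]:
    """Return argument lists for a mapping-then-calling sketch; nothing is executed."""
    # Read-group header line passed to `bwa mem -R`; BWA expects the literal
    # two-character sequence "\t" between fields, hence the escaped backslash.
    # The ID/SM/PL values are placeholders (assumption).
    read_group = f"@RG\\tID:{sample}\\tSM:{sample}\\tPL:ILLUMINA"

    # GATK 3-style invocation prefix (assumption: the jar is in the working directory).
    gatk = ["java", "-jar", "GenomeAnalysisTK.jar"]

    return [
        # Step 1: map paired-end FASTQ reads to the reference (SAM is written to stdout).
        ["bwa", "mem", "-R", read_group, ref, fq1, fq2],
        # Step 2: call variants on a processed BAM. The BAM name is hypothetical and
        # stands in for the output of the processing steps omitted from this sketch.
        gatk + ["-T", "HaplotypeCaller", "-R", ref,
                "-I", f"{sample}.bam", "-o", f"{sample}.vcf"],
    ]


if __name__ == "__main__":
    # Print each command as a shell-like string for inspection.
    for cmd in build_pipeline("reference.fasta", "reads_1.fastq", "reads_2.fastq", "sample1"):
        print(" ".join(cmd))
```

Returning argument lists rather than executing them keeps the sketch safe to run without either tool installed; each list could be passed to `subprocess.run` once the paths and intermediate steps have been filled in from the full protocol.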

Keywords: NGS; WGS; exome; genotyping; variant detection.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Calibration
  • Databases, Genetic
  • Genetic Variation*
  • Genome, Human*
  • Haploidy
  • Haplotypes / genetics
  • Humans
  • Molecular Sequence Annotation
  • Polymorphism, Single Nucleotide / genetics
  • Sequence Alignment
  • Software*