How to Identify Pathogenic Mutations among All Those Variations: Variant Annotation and Filtration in the Genome Sequencing Era

Hum Mutat. 2016 Dec;37(12):1272-1282. doi: 10.1002/humu.23110. Epub 2016 Sep 26.


High-throughput sequencing technologies have become fundamental for identifying disease-causing mutations in human genetic diseases, in both research and clinical testing contexts. The cumulative number of genes linked to rare diseases is now close to 3,500, with more than 1,000 genes identified between 2010 and 2014 because of the early adoption of exome sequencing technologies. Despite these encouraging figures, the success rate of clinical exome diagnosis remains low because of several factors, including incorrect variant annotation and nonoptimal filtration practices, which may lead to misinterpretation of disease-causing mutations. In this review, we describe the critical steps of the variant annotation and filtration processes used to highlight a handful of potential disease-causing mutations for downstream analysis. We report the key annotation elements to gather at multiple levels for each mutation and the systems designed to help collect this mandatory information. We describe the filtration options, their efficiency, and their limits; provide a generic filtration workflow; and highlight potential pitfalls through a use case.

Keywords: good practices; high-throughput sequencing; pathogenic mutation; variant annotation; variant filtration.

Publication types

  • Review
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Exome
  • Genetic Predisposition to Disease
  • High-Throughput Nucleotide Sequencing / methods
  • Humans
  • Molecular Sequence Annotation / methods*
  • Mutation*
  • Sequence Analysis, DNA / methods
  • Software