Fast and accurate mutation detection in whole genome sequences of multiple isogenic samples with IsoMut

BMC Bioinformatics. 2017 Jan 31;18(1):73. doi: 10.1186/s12859-017-1492-4.

Abstract

Background: Detection of somatic mutations is one of the main goals of next generation DNA sequencing. A wide range of experimental systems are available for the study of spontaneous or environmentally induced mutagenic processes. However, most of the routinely used mutation calling algorithms are not optimised for the simultaneous analysis of multiple samples, or for non-human experimental model systems with no reliable databases of common genetic variations. Most standard tools either require numerous in-house post filtering steps with scarce documentation or take an unpractically long time to run. To overcome these problems, we designed the streamlined IsoMut tool which can be readily adapted to experimental scenarios where the goal is the identification of experimentally induced mutations in multiple isogenic samples.

Methods: Using 30 isogenic samples, reliable cohorts of validated mutations were created for testing purposes. Optimal values of the filtering parameters of IsoMut were determined in a thorough and strict optimization procedure based on these test sets.

Results: We show that IsoMut, when tuned correctly, decreases the false positive rate compared to conventional tools in a 30 sample experimental setup; and detects not only single nucleotide variations, but short insertions and deletions as well. IsoMut can also be run more than a hundred times faster than the most precise state of art tool, due its straightforward and easily understandable filtering algorithm.

Conclusions: IsoMut has already been successfully applied in multiple recent studies to find unique, treatment induced mutations in sets of isogenic samples with very low false positive rates. These types of studies provide an important contribution to determining the mutagenic effect of environmental agents or genetic defects, and IsoMut turned out to be an invaluable tool in the analysis of such data.

Keywords: Demonstrative algorithm; Low false positive rate; Multiple isogenic samples; Mutagenesis; Next generation sequencing; Somatic mutation detection.

MeSH terms

  • Algorithms
  • DNA Mutational Analysis / methods*
  • Genomics / methods
  • High-Throughput Nucleotide Sequencing / methods*
  • Humans
  • Mutation
  • Sequence Deletion
  • Software*