Analysis of insertion-deletion from deep-sequencing data: software evaluation for optimal detection

Brief Bioinform. 2013 Jan;14(1):46-55. doi: 10.1093/bib/bbs013. Epub 2012 Mar 24.

Abstract

Insertion and deletion (indel) mutations, the most common type of structural variation in the human genome, affect a multitude of human traits and diseases. New sequencing technologies, such as deep sequencing, allow massive throughput of sequence data and contribute greatly to the detection of disease-causing mutations in general and of indels specifically. To infer indel presence (indel calling), deep-sequencing data must undergo comprehensive computational analysis. The choice of indel-calling software can skew the results, and inherent tool limitations may affect downstream analysis. To better understand these inter-software differences, we evaluated the performance of several indel-calling software tools for short indel (1-10 nt) detection. We compared the tools' sensitivity and predictive values under varying parameters such as read depth (coverage), read length, indel size and frequency. We pinpoint several key features that assist successful experimental design and appropriate tool selection. Our study may also serve as a basis for future evaluation of additional indel-calling methods.
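
As an illustration of the two metrics named above, the following minimal sketch computes sensitivity and positive predictive value for one caller against a known (for example, simulated) truth set. It is not the evaluation code used in the study. The tuple representation of an indel, the function name `evaluate_calls`, and the example variants are all hypothetical, and exact matching on position and alleles is a simplifying assumption: the same indel can be reported at more than one position in a repetitive region, so a real comparison would likely need to normalize calls first.

```python
from typing import NamedTuple, Set, Tuple

# Hypothetical representation: (chromosome, position, ref allele, alt allele)
Indel = Tuple[str, int, str, str]


class CallMetrics(NamedTuple):
    true_positives: int
    false_positives: int
    false_negatives: int
    sensitivity: float  # TP / (TP + FN)
    ppv: float          # positive predictive value, TP / (TP + FP)


def evaluate_calls(truth: Set[Indel], calls: Set[Indel]) -> CallMetrics:
    """Compare a caller's indels with a truth set using exact matching."""
    tp = len(truth & calls)   # called and present in the truth set
    fp = len(calls - truth)   # called but absent from the truth set
    fn = len(truth - calls)   # present in the truth set but not called
    # Return NaN when a denominator is zero (empty truth set or no calls).
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    return CallMetrics(tp, fp, fn, sensitivity, ppv)


# Made-up example: four true indels, of which the caller recovers three
# and additionally reports one indel that is not in the truth set.
truth = {
    ("chr1", 100, "A", "ATG"),
    ("chr1", 250, "CTT", "C"),
    ("chr2", 75, "G", "GA"),
    ("chr2", 900, "TAAAA", "T"),
}
calls = {
    ("chr1", 100, "A", "ATG"),
    ("chr2", 75, "G", "GA"),
    ("chr2", 900, "TAAAA", "T"),
    ("chr3", 40, "C", "CG"),
}
metrics = evaluate_calls(truth, calls)
print(f"sensitivity={metrics.sensitivity:.2f} ppv={metrics.ppv:.2f}")
```

Running the same function on subsets of the truth set grouped by read depth, read length, indel size or frequency would give per-condition values of the kind the abstract describes comparing.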

Publication types

  • Evaluation Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computational Biology / statistics & numerical data*
  • Computer Simulation
  • DNA Mutational Analysis* / statistics & numerical data
  • Humans
  • INDEL Mutation*
  • Mutagenesis, Insertional
  • Polymorphism, Single Nucleotide
  • Sequence Analysis, DNA* / statistics & numerical data
  • Sequence Deletion
  • Software*