BlindCall: ultra-fast base-calling of high-throughput sequencing data by blind deconvolution

Bioinformatics. 2014 May 1;30(9):1214-9. doi: 10.1093/bioinformatics/btu010. Epub 2014 Jan 9.

Abstract

Motivation: Base-calling of sequencing data produced by high-throughput sequencing platforms is a fundamental process in current bioinformatics analysis. However, existing third-party probabilistic or machine-learning methods that significantly improve the accuracy of base-calls on these platforms are impractical for production use due to their computational inefficiency.

Results: We directly formulated base-calling as a blind deconvolution problem and implemented BlindCall as an efficient solver for this inverse problem. BlindCall produced base-calls at accuracy comparable to state-of-the-art probabilistic methods while processing data 10 times faster in most cases. The computational complexity of BlindCall scales linearly with read length, making it better suited for new long-read sequencing technologies.
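
The abstract does not spell out the signal model or the solver, so the following is only a toy illustration of what treating base-calling as a blind deconvolution problem can look like. It is not the BlindCall algorithm. It assumes a made-up model in which each cycle's four-channel intensity is the true one-hot base signal blurred along the cycle axis by an unknown short kernel (a stand-in for phasing lag) plus noise, and it alternates between estimating the bases and estimating the kernel by least squares. All names (`conv_matrix`, `blind_call`), the kernel values, and the noise level are hypothetical.

```python
import numpy as np

def conv_matrix(h, T):
    """Lower-triangular Toeplitz matrix H with (H @ x)[i] = sum_k h[k] * x[i - k]."""
    H = np.zeros((T, T))
    for k, hk in enumerate(h):
        H += hk * np.eye(T, k=-k)  # k-th subdiagonal carries kernel tap h[k]
    return H

def one_hot(idx, n=4):
    """Turn base indices (0..n-1) into an (len(idx), n) indicator matrix."""
    return np.eye(n)[idx]

def blind_call(Y, K=3, n_iter=5):
    """Toy alternating solver: neither the bases nor the blur kernel are known."""
    T, n_ch = Y.shape
    h = np.zeros(K)
    h[0] = 1.0  # start from "no blur" (assumed initial guess)
    for _ in range(n_iter):
        # Step 1: deconvolve with the current kernel, then snap each cycle
        # to the strongest channel (one base per cycle).
        Z = np.linalg.solve(conv_matrix(h, T), Y)
        calls = Z.argmax(axis=1)
        X = one_hot(calls, n_ch)
        # Step 2: refit the kernel by least squares given the called bases.
        # Column k of each channel block is that channel's signal delayed by k cycles.
        A = np.vstack([
            np.column_stack(
                [np.concatenate([np.zeros(k), X[:T - k, c]]) for k in range(K)]
            )
            for c in range(n_ch)
        ])
        y = Y.T.reshape(-1)  # channel-major, matching the row order of A
        h, *_ = np.linalg.lstsq(A, y, rcond=None)
    return calls, h

# Simulated read under the assumed model (all values are illustrative).
rng = np.random.default_rng(0)
T = 200
true_calls = rng.integers(0, 4, size=T)
h_true = np.array([0.7, 0.2, 0.1])
Y = conv_matrix(h_true, T) @ one_hot(true_calls) + 0.02 * rng.normal(size=(T, 4))

calls, h_est = blind_call(Y)
accuracy = (calls == true_calls).mean()
print("".join("ACGT"[i] for i in calls[:20]), accuracy, h_est.round(2))
```

This sketch uses dense matrix solves for readability, so its cost grows faster than linearly with read length; it does not reproduce the linear scaling reported for BlindCall.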

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms
  • High-Throughput Nucleotide Sequencing / methods*
  • Humans
  • Probability
  • Reproducibility of Results
  • Sequence Analysis, DNA / methods*
  • Software
  • Time Factors