MAGERI: Computational pipeline for molecular-barcoded targeted resequencing

PLoS Comput Biol. 2017 May 5;13(5):e1005480. doi: 10.1371/journal.pcbi.1005480. eCollection 2017 May.

Abstract

Unique molecular identifiers (UMIs) show outstanding performance in targeted high-throughput resequencing, being the most promising approach for the accurate identification of rare variants in complex DNA samples. This approach has application in multiple areas, including cancer diagnostics, thus demanding dedicated software and algorithms. Here we introduce MAGERI, a computational pipeline that efficiently handles all caveats of UMI-based analysis to obtain high-fidelity mutation profiles and call ultra-rare variants. Using an extensive set of benchmark datasets, including gold-standard biological samples with known variant frequencies, cell-free DNA from tumor patient blood samples, and publicly available UMI-encoded datasets, we demonstrate that our method is both robust and efficient in calling rare variants. The versatility of our software is supported by accurate results obtained for both tumor DNA and viral RNA samples in datasets prepared using three different UMI-based protocols.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Biomarkers, Tumor / blood
  • Biomarkers, Tumor / genetics
  • Computational Biology / methods*
  • Databases, Genetic
  • High-Throughput Nucleotide Sequencing / methods*
  • Humans
  • Neoplasms / genetics
  • RNA, Viral / genetics
  • Sequence Analysis, DNA / methods
  • Sequence Analysis, RNA / methods
  • Software*

Substances

  • Biomarkers, Tumor
  • RNA, Viral

Grant support

This study was supported by Russian Science Foundation grant №14-35-00105 (oncodiagnostics development), by RFBR grant 15-34-21052 (development of an empirical model of PCR errors), by the Ministry of Education, Youth and Sports of the Czech Republic under the project CEITEC 2020 (LQ1601) (data analysis), and by the European Union’s Horizon 2020 research and innovation programme under grant agreement No 633592 (APERIM) (development of algorithms to detect potential immunotherapy targets). This publication reflects only the author's view and the Commission is not responsible for any use that may be made of the information it contains. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.