Increase in taxonomic assignment efficiency of viral reads in metagenomic studies

Virus Res. 2018 Jan 15:244:230-234. doi: 10.1016/j.virusres.2017.11.011. Epub 2017 Nov 14.

Abstract

Metagenomic studies have revolutionized the field of biology by revealing the presence of many previously unisolated and uncultured microorganisms. However, one of the main problems encountered in metagenomic studies is the high percentage of sequences that cannot be assigned taxonomically using commonly used similarity-based approaches (e.g. BLAST or HMM). These unassigned sequences are metaphorically called "dark matter" in the metagenomic literature and are often referred to as being derived from new or unknown organisms. Here, based on published and original metagenomic datasets from virus-like particle enriched samples, we present and quantify the improvement in viral taxonomic assignment that is achievable with a new similarity-based approach. Specifically, before applying any similarity-based taxonomic assignment method, we propose assembling contigs from short reads, as is routinely done in metagenomic studies, and then mapping the unassembled reads to the assembled contigs. This additional mapping step significantly increases the proportion of taxonomically assignable sequence reads across a variety of virome studies: plant, insect and environmental (estuary, lakes, soil, feces).

Keywords: BLAST; Dark matter; Mapping; Viral metagenomics.
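
The effect of the additional mapping step described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `assignable_fraction`, the read and contig identifiers, the taxon label and the toy counts are all hypothetical, and the sketch assumes that a read inherits the taxon of the contig it was assembled into or mapped to. In practice the read-to-contig links would come from an assembler and a read mapper, and the contig taxa from a similarity search such as BLAST.

```python
from typing import Dict, Optional


def assignable_fraction(
    read_to_contig: Dict[str, str],
    contig_taxon: Dict[str, Optional[str]],
    total_reads: int,
) -> float:
    """Fraction of all reads linked to a contig that has a taxonomic assignment."""
    assigned = sum(
        1
        for contig in read_to_contig.values()
        if contig_taxon.get(contig) is not None
    )
    return assigned / total_reads


# Hypothetical toy data: 8 reads in total (r1..r8).
# contig_2 had no similarity hit, so its taxon is None ("dark matter").
contig_taxon = {"contig_1": "virus family A", "contig_2": None}

# Reads incorporated into contigs by the assembler (assumed input).
assembled = {"r1": "contig_1", "r2": "contig_1", "r3": "contig_2"}

# Reads left out of the assembly but mapped back to contigs afterwards.
mapped = {"r4": "contig_1", "r5": "contig_2"}

total_reads = 8  # r6..r8 were neither assembled nor mapped

before = assignable_fraction(assembled, contig_taxon, total_reads)
after = assignable_fraction({**assembled, **mapped}, contig_taxon, total_reads)

print(f"assignable before mapping: {before:.3f}")
print(f"assignable after mapping:  {after:.3f}")
```

With these toy numbers, the mapping step moves one extra read (r4) onto an assigned contig, so the assignable fraction rises from 2/8 to 3/8. Reads that map to a contig without a similarity hit (r5) remain unassigned.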

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Animals
  • Contig Mapping / methods*
  • Databases, Genetic
  • Datasets as Topic
  • Feces / virology
  • Fresh Water / virology
  • Gene Ontology
  • Genome, Viral*
  • Humans
  • Insecta / virology
  • Metagenomics / methods*
  • Molecular Sequence Annotation
  • Molecular Typing
  • Plants / virology
  • Sequence Analysis, DNA
  • Soil Microbiology
  • Viruses / classification*
  • Viruses / genetics*
  • Viruses / isolation & purification