Viral dark matter and virus-host interactions resolved from publicly available microbial genomes

eLife. 2015 Jul 22;4:e08490. doi: 10.7554/eLife.08490.

Abstract

The ecological importance of viruses is now widely recognized, yet our limited knowledge of viral sequence space and virus-host interactions precludes accurate prediction of their roles and impacts. In this study, we mined publicly available bacterial and archaeal genomic data sets to identify 12,498 high-confidence viral genomes linked to their microbial hosts. These data augment public data sets 10-fold, provide the first viral sequences for 13 new bacterial phyla including ecologically abundant phyla, and help taxonomically identify 7-38% of 'unknown' sequence space in viromes. Genome- and network-based classification was largely consistent with accepted viral taxonomy and suggested that (i) 264 new viral genera were identified (doubling known genera) and (ii) cross-taxon genomic recombination is limited. Further analyses provided empirical data on extrachromosomal prophages and coinfection prevalence, as well as an evaluation of in silico virus-host linkage predictions. Together, these findings illustrate the value of mining viral signal from microbial genomes.

Keywords: ecology; evolutionary biology; genomics; phage; prophage; virus; virus-host adaptation.

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Archaea / genetics*
  • Archaea / virology*
  • Bacteria / genetics*
  • Bacteria / virology*
  • Genome, Microbial*
  • Host-Pathogen Interactions*
  • Viruses / genetics*

Grant support

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.