Pervasive Chimerism in the Replication-Associated Proteins of Uncultured Single-Stranded DNA Viruses

Viruses. 2018 Apr 10;10(4):187. doi: 10.3390/v10040187.

Abstract

Numerous metagenomic studies have uncovered a remarkable diversity of circular replication-associated protein (Rep)-encoding single-stranded (CRESS) DNA viruses, the majority of which are uncultured and unclassified. Unlike capsid proteins, the Reps show significant similarity across different groups of CRESS DNA viruses and have a conserved domain organization, with an N-terminal nuclease domain and a C-terminal helicase domain. Consequently, Rep is widely used as a marker for identification, classification and assessment of the diversity of CRESS DNA viruses. However, it has been shown that in certain viruses the Rep nuclease and helicase domains display incongruent evolutionary histories. Here, we systematically evaluated the co-evolutionary patterns of the two Rep domains across classified and unclassified CRESS DNA viruses. Our analysis indicates that the Reps encoded by members of the families Bacilladnaviridae, Circoviridae, Geminiviridae, Genomoviridae, Nanoviridae and Smacoviridae display largely congruent evolutionary patterns in the two domains. By contrast, among the unclassified CRESS DNA viruses, 71% appear to have chimeric Reps. Such massive chimerism suggests that unclassified CRESS DNA viruses represent a dynamic population in which exchange of gene fragments encoding the nuclease and helicase domains is extremely common. Furthermore, purging of the chimeric sequences uncovered six monophyletic Rep groups that may represent new families of CRESS DNA viruses.

Keywords: CRESS DNA viruses; HUH endonuclease domain; recombination; rolling-circle replication initiation proteins; ssDNA viruses; superfamily 3 helicase domain; virus evolution.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Chimerism*
  • DNA Viruses / classification*
  • DNA Viruses / genetics*
  • DNA, Single-Stranded / genetics*
  • Evolution, Molecular
  • Genome, Viral / genetics
  • Metagenomics
  • Phylogeny*
  • Protein Domains / genetics
  • Viral Proteins / genetics

Substances

  • DNA, Single-Stranded
  • Viral Proteins