ISME J, 7 (11), 2169-77

Previously Unknown and Highly Divergent ssDNA Viruses Populate the Oceans

Jessica M Labonté et al. ISME J.

Abstract

Single-stranded DNA (ssDNA) viruses are economically important pathogens of plants and animals, and are widespread in oceans; yet, the diversity and evolutionary relationships among marine ssDNA viruses remain largely unknown. Here we present the results from a metagenomic study of composite samples from temperate (Saanich Inlet, 11 samples; Strait of Georgia, 85 samples) and subtropical (Gulf of Mexico, 46 samples) seawater. Most sequences (84%) had no evident similarity to sequenced viruses. In total, 608 putative complete genomes of ssDNA viruses were assembled, almost doubling the number of ssDNA viral genomes in databases. These comprised 129 genetically distinct groups, each represented by at least one complete genome, that had no recognizable similarity to each other or to other virus sequences. Given that the seven recognized families of ssDNA viruses have considerable sequence homology within them, this suggests that many of these genetic groups may represent new viral families. Moreover, nearly 70% of the sequences were similar to one of these genomes, indicating that most of the sequences could be assigned to a genetically distinct group. Most sequences fell within 11 well-defined gene groups, each sharing a common gene. Some of these encoded putative replication and coat proteins that had similarity to sequences from viruses infecting eukaryotes, suggesting that these were likely from viruses infecting eukaryotic phytoplankton and zooplankton.

Figures

Figure 1
BLAST comparison of the contigs against (a) ssDNA viral families (e-value <10⁻⁵) and (b) the NCBI database (e-value <10⁻³) for the Strait of Georgia (SOG; 1260 contigs), Saanich Inlet (SI; 2399 contigs) and Gulf of Mexico (GOM; 1336 contigs).
Figure 2
Feature frequency profile (FFP) analyses of ssDNA virus isolates from the NCBI database and genomes from this study. Neighbor-joining tree (left) and multidimensional scaling (right) (goodness of fit = 0.6495) of viral isolates (crosses) and composite genomes (dots) demonstrate that FFP of heptamers is able to resolve evolutionary relationships among ssDNA viruses. The shaded areas emphasize the established families of ssDNA viruses and the new evolutionary clusters identified in this study.
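As a rough illustration of the idea behind a heptamer profile comparison, the sketch below counts overlapping 7-mers in each sequence, normalizes the counts to frequencies, and compares two profiles with the Jensen-Shannon divergence. The function names, the choice of divergence and the handling of ambiguous bases are assumptions made for this example; the caption does not state how the authors computed distances, and this is not their pipeline.

```python
from collections import Counter
from math import log2


def kmer_profile(seq, k=7):
    """Relative frequencies of overlapping k-mers (windows with non-ACGT bases are skipped)."""
    seq = seq.upper()
    counts = Counter(
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= set("ACGT")
    )
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()} if total else {}


def jensen_shannon(p, q):
    """Jensen-Shannon divergence (log base 2) between two sparse profiles, in [0, 1]."""
    div = 0.0
    for kmer in set(p) | set(q):
        pi, qi = p.get(kmer, 0.0), q.get(kmer, 0.0)
        mi = (pi + qi) / 2  # mixture of the two profiles at this k-mer
        if pi:
            div += 0.5 * pi * log2(pi / mi)
        if qi:
            div += 0.5 * qi * log2(qi / mi)
    return div
```

A matrix of such pairwise divergences could then be passed to a neighbor-joining or multidimensional scaling routine, which is the kind of input the tree and ordination in Figure 2 require.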
Figure 3
Network representation of the BLAST comparisons of the environmental genomes (i.e. genomes assembled from metagenomic data) with previously known ssDNA viruses (e-value <10⁻⁵) and with other environmental metagenomes (e-value <10⁻¹⁰). Each node represents a complete metagenome or genome (circle: isolate; triangle: GOM; diamond: SI; square: SOG) and each link represents a BLAST hit. The color of the outline of the node represents the viral family (blue: Geminiviridae; green: Nanoviridae; dark red: Circoviridae; orange: Parvoviridae; olive green: Microviridae) or the color of the cluster assigned in the FFP analysis from Figure 2 (blue: Cluster 1, dark blue: Cluster 3, aqua: Cluster 4 and purple: Cluster 5). The shaded areas highlight those genomes that have a conserved CDS in common. A solid colored node means that the genome contains the full-length conserved protein. The shaded gray boxes (upper left) encompass whole genomes with either no recognizable similarity to other genomes (singletons) or with similarity to one other genome (doublets).
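The network described in this caption can be read as an undirected graph in which genomes are nodes and BLAST hits below an e-value cutoff are links. The sketch below is a minimal example under that reading: it groups genomes into connected components and counts singletons and doublets. The function name, the `(query, subject, evalue)` tuple layout and the genome identifiers are hypothetical, and a single cutoff of 1e-10 is assumed for simplicity.

```python
from collections import defaultdict


def group_genomes(genomes, hits, max_evalue=1e-10):
    """Return connected components of the graph whose links are BLAST hits below max_evalue."""
    adjacency = defaultdict(set)
    for query, subject, evalue in hits:
        # Ignore self-hits and hits that do not pass the e-value cutoff.
        if query != subject and evalue < max_evalue:
            adjacency[query].add(subject)
            adjacency[subject].add(query)

    seen, groups = set(), []
    for start in genomes:
        if start in seen:
            continue
        # Depth-first traversal collecting every genome reachable from start.
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        groups.append(component)
    return groups


# Hypothetical identifiers and e-values, for illustration only.
genomes = ["g1", "g2", "g3", "g4"]
hits = [("g1", "g2", 1e-30), ("g3", "g4", 1e-4)]
groups = group_genomes(genomes, hits)
singletons = [g for g in groups if len(g) == 1]
doublets = [g for g in groups if len(g) == 2]
```

With these example inputs, the weak hit between `g3` and `g4` fails the cutoff, so both remain singletons while `g1` and `g2` form a doublet.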
Figure 4
Relative percentage of contigs from each of the viral groups identified in this study for the SOG, SI and GOM as determined by BLAST comparison (e-value <10⁻¹⁰).
Figure 5
Unrooted phylogenetic analysis (maximum likelihood; model WAG; 100 bootstrap replicates) representing the genetic relatedness of the rolling-circle replication protein of nanoviruses (green), geminiviruses (blue), circoviruses (red), cycloviruses (purple) and the environmental sequences (black dots: this study, gray dots: other studies). The black, dark gray and light gray branches represent >90%, 75–89% and <75% bootstrap support, respectively. Black and gray asterisks at internal nodes represent at least 90% and 75% aLRT bootstrap support, respectively. Roman numerals represent new deeply branched phylogenetic groups of the rolling-circle replication protein from viruses that likely infect phytoplankton (green), zooplankton (red) or other protists (blue).
