, 6, 223

Eukaryotic Large Nucleo-Cytoplasmic DNA Viruses: Clusters of Orthologous Genes and Reconstruction of Viral Genome Evolution

Natalya Yutin et al. Virol J.

Abstract

Background: The Nucleo-Cytoplasmic Large DNA Viruses (NCLDV) comprise an apparently monophyletic class of viruses that infect a broad variety of eukaryotic hosts. Recent progress in the isolation of new viruses and in genome sequencing has substantially expanded the known NCLDV diversity, creating additional opportunities for comparative genomic analysis and a demand for a comprehensive classification of viral genes.

Results: A comprehensive comparison of the protein sequences encoded in the genomes of 45 NCLDV belonging to 6 families was performed in order to delineate clusters of orthologous viral genes. Using previously developed computational methods for orthology identification, 1445 Nucleo-Cytoplasmic Virus Orthologous Groups (NCVOGs) were identified, of which 177 are represented in more than one NCLDV family. The NCVOGs were manually curated and annotated and can be used as a computational platform for functional annotation and evolutionary analysis of new NCLDV genomes. A maximum-likelihood reconstruction of NCLDV evolution yielded a set of 47 conserved genes that were probably present in the genome of the common ancestor of this class of eukaryotic viruses. This reconstructed ancestral gene set is robust to the parameters of the reconstruction procedure and is therefore likely to accurately reflect the gene core of the ancestral NCLDV. It indicates that this virus encoded a complex machinery of replication, expression and morphogenesis that made it relatively independent of host cell functions.
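
The split between multi-family and family-specific NCVOGs can be illustrated with a minimal sketch. With the reported totals, 1445 - 177 = 1268 NCVOGs are confined to a single family, which matches the count given in the Figure 5 caption. The cluster IDs, virus names, family labels, and function names below are hypothetical toy data chosen for illustration; this is not the authors' orthology pipeline or actual NCVOG membership.

```python
from collections import Counter

# Hypothetical toy membership table: cluster ID -> list of (virus, family) pairs.
toy_clusters = {
    "toyOG1": [("virusA", "familyA"), ("virusB", "familyA")],
    "toyOG2": [("virusA", "familyA"), ("virusC", "familyB"), ("virusD", "familyC")],
    "toyOG3": [("virusC", "familyB")],
    "toyOG4": [("virusD", "familyC"), ("virusE", "familyD")],
}


def families_per_cluster(clusters):
    """Return the number of distinct families represented in each cluster."""
    return {cid: len({family for _, family in members})
            for cid, members in clusters.items()}


def split_by_breadth(clusters):
    """Separate clusters seen in more than one family from family-specific ones."""
    counts = families_per_cluster(clusters)
    multi = sorted(cid for cid, n in counts.items() if n > 1)
    specific = sorted(cid for cid, n in counts.items() if n == 1)
    return multi, specific


multi, specific = split_by_breadth(toy_clusters)
# Histogram of "number of families represented" -> "number of clusters".
histogram = Counter(families_per_cluster(toy_clusters).values())
print("multi-family:", multi)
print("family-specific:", specific)
print("histogram:", dict(histogram))
```

The histogram is the same kind of summary as the distribution described in the Figure 1 caption, computed here on toy input only.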

Conclusions: The NCVOGs are a flexible and expandable platform for genome analysis and functional annotation of newly characterized NCLDV. Evolutionary reconstructions employing NCVOGs point to complex ancestral viruses.

Figures

Figure 1. Distribution of the number of NCLDV families represented in NCVOGs.
Figure 2. Distribution of the number of NCLDV species represented in NCVOGs.
Figure 3. Numbers of NCVOGs that include paralogs in each of the analyzed viruses.
Figure 4. Fractions of NCVOGs that include paralogs in each of the analyzed viruses.
Figure 5. Distribution of the 1268 family-specific NCVOGs among the 6 NCLDV families.
Figure 6. Functional classification of the 177 NCVOGs that include two or more NCLDV families.
Figure 7. The consensus phylogenetic tree of the NCLDV. The Expected Likelihood Weights (ELW; 1,000 replications) are indicated for each ancestral node as percentage points. The topology of the tree was derived as the consensus of the tree topologies for the following 10 (nearly) universal NCVOGs: superfamily II helicase (NCVOG0076), A2L-like transcription factor (NCVOG0262), RNA polymerase α subunit (NCVOG0274), RNA polymerase β subunit (NCVOG0271), mRNA capping enzyme, A32-like packaging ATPase (NCVOG0249), small subunit of ribonucleotide reductase (NCVOG0276), myristylated envelope protein (NCVOG0211), primase-helicase (NCVOG0023), and DNA polymerase (NCVOG0038) (see Additional File 2). The branch lengths and ELW values are from a tree that was constructed from a concatenated alignment of 4 universal proteins (primase-helicase, DNA polymerase, packaging ATPase, and A2L-like transcription factor).
Figure 8. Reconstruction of the ancestral NCLDV gene sets. The inferred numbers of genes present in each internal node are shown in blue. The numbers of NCVOGs present with a likelihood greater than 0.9 in the 9 deepest nodes (numbered) are shown in red. For the complete list of these NCVOGs, see Additional File 4. The tree from Figure 7 was used as a guide for the reconstruction.
Figure 9. The size of reconstructed ancestral gene sets depending on the likelihood threshold.
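
The dependence of ancestral gene set size on the likelihood threshold can be sketched as a simple counting step. This covers only the thresholding; it does not reproduce the maximum-likelihood reconstruction itself. The probabilities, cluster IDs, and the function name `ancestral_set_size` are hypothetical and chosen for illustration.

```python
# Hypothetical per-cluster probabilities of presence at one ancestral node.
toy_probs = {
    "toyOG1": 0.99,
    "toyOG2": 0.95,
    "toyOG3": 0.72,
    "toyOG4": 0.40,
    "toyOG5": 0.05,
}


def ancestral_set_size(probabilities, threshold):
    """Count the clusters whose presence probability exceeds the threshold."""
    return sum(1 for p in probabilities.values() if p > threshold)


# Size of the inferred gene set at several cutoffs, including the 0.9 cutoff
# mentioned in the Figure 8 caption.
sizes = {t: ancestral_set_size(toy_probs, t) for t in (0.5, 0.7, 0.9)}
for threshold, size in sizes.items():
    print(f"threshold {threshold}: {size} clusters")
```

A gene set that changes little as the cutoff moves is what the abstract describes as robust to the parameters of the reconstruction procedure.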
