, 6 (11), e1001191

Endogenous Viral Elements in Animal Genomes

Aris Katzourakis et al. PLoS Genet.

Abstract

Integration into the nuclear genome of germ line cells can lead to vertical inheritance of retroviral genes as host alleles. For other viruses, germ line integration has only rarely been documented. Nonetheless, we identified endogenous viral elements (EVEs) derived from ten non-retroviral families by systematic in silico screening of animal genomes, including the first endogenous representatives of double-stranded RNA, reverse-transcribing DNA, and segmented RNA viruses, and the first endogenous DNA viruses in mammalian genomes. Phylogenetic and genomic analysis of EVEs across multiple host species revealed novel information about the origin and evolution of diverse virus groups. Furthermore, several of the elements identified here encode intact open reading frames or are expressed as mRNA. For one element in the primate lineage, we provide statistically robust evidence for exaptation. Our findings establish that genetic material derived from all known viral genome types and replication strategies can enter the animal germ line, greatly broadening the scope of paleovirological studies and indicating a more significant evolutionary role for gene flow from virus to animal genomes than has previously been recognized.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Figure 1. Viral replication strategies, endogenous viral elements, and the genomic fossil record.
Animal viruses exhibit a range of genome types and replication strategies. While all viruses must produce mRNA in order to express proteins, steps between entry into the cell and the expression of mRNA vary greatly. Examples of the known animal virus replication strategies are shown to the left of the figure, with the representative families listed for each case. Arrows indicate steps in replication. Red lines indicate pathways that lead to viral genetic material becoming integrated into the nuclear genome of the host cell. Retroviruses are unique amongst animal viruses in that integration occurs as an obligate step in replication. For all other animal viruses integration occurs anomalously, through interaction with cellular retroelements such as LINEs, or via non-homologous recombination with genomic DNA. If integration occurs in a germ line cell that goes on to develop into a viable host organism, an EVE is formed. Green lines show the evolution of an EVE in its host lineage. In the example given, the EVE reaches genetic fixation at the point indicated, and is inherited by all descendant hosts thereafter. Assuming that insertion occurs randomly, the presence of related EVEs at the same locus in both descendant species A and B indicates that insertion occurred prior to their divergence, allowing a minimum age for the insertion to be inferred from the estimated timescale of their evolution. Conversely, the presence of an empty insertion site in species C provides a maximum age for the insertion. Abbreviations: dsDNA (double stranded DNA); ssDNA (single stranded DNA); dsRNA (double stranded RNA); RNA-ve (negative sense, single stranded RNA); RNA+ve (positive sense, single stranded RNA).
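The dating logic in the Figure 1 legend can be illustrated with a minimal sketch. It assumes that pairwise divergence times are already known; the function name `insertion_age_bounds`, the species labels, and the divergence times below are hypothetical and are not taken from the study.

```python
from itertools import combinations


def insertion_age_bounds(divergence_mya, carriers, empty_site_species):
    """Return (minimum age, maximum age) of an insertion, in million years.

    divergence_mya maps frozenset({species1, species2}) to the estimated
    divergence time of that pair (hypothetical input format).
    """
    def split(a, b):
        return divergence_mya[frozenset((a, b))]

    # Orthologous EVEs in two species imply insertion before they diverged,
    # so the oldest split among carriers gives the minimum age.
    min_age = max((split(a, b) for a, b in combinations(carriers, 2)),
                  default=None)
    # An empty insertion site implies insertion after that lineage split off,
    # so the most recent carrier/empty-site split gives the maximum age.
    max_age = min((split(c, e) for c in carriers for e in empty_site_species),
                  default=None)
    return min_age, max_age


# Hypothetical divergence times (million years ago), for illustration only.
divergence_mya = {
    frozenset(("A", "B")): 10.0,
    frozenset(("A", "C")): 25.0,
    frozenset(("B", "C")): 25.0,
}
bounds = insertion_age_bounds(divergence_mya, ["A", "B"], ["C"])
print(bounds)
```

With only one carrier species, or with no empty-site species, the corresponding bound is returned as `None` because the data do not constrain it.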
Figure 2. Genetic structures and phylogenetic relationships of Mononegavirus EVEs.
(a) Summary genetic structures of Mononegavirus EVE sets (Borna-, Rhabdo- and Filoviridae) shown relative to genus type species. The most intact elements are shown for each host taxon. Vertical lines between EVEs in the same host species that are derived from distinct genes indicate that the EVEs are not contiguous in the host genome. Abbreviations for viral type species (bold), host species (italics), and host taxa (bold, italic, underline) are indicated to the left of each EVE. Taxonomic groups are shown for EVE insertions identified as orthologs. Poly-A tails are shown for EVEs that had these features. Intact ORFs (circles) and expressed sequences (crosses) are indicated. Phylogenetic relationships of (b) bornavirus, (c) rhabdovirus and (d) filovirus EVEs and representative exogenous viruses. Taxa that are shown as genetic structures in (a) are indicated by colored squares. Support for the ML phylogenetic trees was evaluated using 1,000 nonparametric bootstrap replicates, and all three trees are midpoint rooted for display purposes. Abbreviations: BDV = Borna disease virus; ZEBOV = Zaire ebola virus; VSV = vesicular stomatitis virus; L-pol = L-polymerase.
Figure 3. Genetic structures and phylogenetic relationships of EVEs related to segmented RNA viruses.
Summary genetic structures of EVEs derived from segmented RNA viruses, (a) Reoviridae (Seadornavirus genus), (b) Orthomyxoviridae (Quarjavirus genus), and (c) Bunyaviridae (Nairovirus and Phlebovirus genera), shown relative to the genus type species. The most intact elements are shown for each host taxon. Intact ORFs (circles) and expressed sequences (crosses) are indicated. Maximum likelihood phylogenies of EVEs and exogenous viruses are shown: (d) Reoviridae (segment 5), (e) Orthomyxoviridae (GP), (f) Nairovirus (NP), (g) Phlebovirus (NP). Colored boxes indicate taxa that are shown as genetic structures in panels a-c. Support for trees was evaluated using 1,000 nonparametric bootstrap replicates. Abbreviations: LNV = Liaoning virus; CCHF = Crimean-Congo hemorrhagic fever virus; UUKV = Uukuniemi virus; QRFV = Quaranfil virus.
Figure 4. Genetic structures and phylogenetic relationships of EVEs related to flaviviruses and hepadnaviruses.
(a) Genetic structures of non-overlapping flavivirus EVEs in the Aedes aegypti genome. (b) Phylogenetic relationship of consensus flavivirus EVE sequences (spanning most of the region shown in (a)) with exogenous and endogenous flaviviruses. (c) Genetic structures of non-overlapping rtDNA (hepadnavirus) EVEs shown relative to the genus type species. Numbers to the left indicate the T. guttata chromosome on which the EVE is present. (d) Phylogenetic relationships of consensus zebra finch EVEs and representative exogenous viruses. Avihepadnavirus genus is rooted on Woodchuck HBV (Orthohepadnavirus). All ML phylogenetic trees were inferred from amino acid alignments using the best-fitting model of evolution. Support for trees was evaluated using 1,000 nonparametric bootstrap replicates. Abbreviations: KRV = Kamiti River virus; HBV = hepatitis B virus; dHBV = duck hepatitis B virus.
Figure 5. Genetic structures and phylogenetic relationships of EVEs related to ssDNA viruses.
(a) Summary genetic structures of ssDNA EVE sets shown relative to the genus type species. The most intact elements are shown for each host taxon. EVE hosts (bold) and abbreviations for viral type species (bold, underline), and the total number of matches identified (italic) are indicated to the left of each EVE structure. Bars behind ORFs indicate non-coding viral DNA. Intact ORFs (circles) and expressed sequences (crosses) are indicated. 1 The M. lucifugus element is a composite of two genomic contigs; 2 Structures represent a composite of semi-overlapping fragments; 3 Element has undergone genomic rearrangements, with the arrow indicating the direction of the rearranged fragment. (b) Phylogenetic relationships of dependovirus EVEs and representative exogenous viruses, based on NS1 gene and rooted on snake parvovirus. A dolphin EVE (indicated by an asterisk) groups robustly with avian rather than mammalian isolates. (c) Phylogenetic relationships of parvovirus EVEs and representative exogenous viruses, based on NS1 gene and rooted on Aleutian mink disease virus. Support for both ML phylogenetic trees was evaluated using 1,000 nonparametric bootstrap replicates. EVEs potentially comprising a new genus are indicated. (d) Phylogenetic relationships of circovirus EVEs and representative exogenous viruses, based on the Rep gene and rooted on avian circoviruses, with support for the ML phylogenetic tree evaluated using 1,000 nonparametric bootstrap replicates. Taxa that are shown as genetic structures in (a) are indicated by colored squares (EVEs) and circles (exogenous viruses). Abbreviations: AAV = adeno-associated virus; MMV = minute virus of mice; AMDV = Aleutian mink disease virus; PV = parvovirus; CV = circovirus; PCV = porcine circovirus; HSAV = human stool-associated circular virus.
Figure 6. Timescaled phylogenetic tree of mammals screened in this study (after Bininda-Emonds et al. [42]) showing the known distribution of EVEs and of exogenous Borna-, Filo-, Circo-, and Parvoviruses.
Grey circles indicate nodes at which orthologous EVE insertions were identified. For all orthologous insertions identified here and elsewhere, the virus family and genomic region represented by the ortholog is shown. Abbreviation: TLS = Thirteen-lined ground squirrel.
Figure 7. Evolution of EBLN elements in primates.
(a) The primate clade marked by an asterisk in the phylogeny shown in Figure 6 is shown in greater detail here, with the number of stop codons in the EBLN-1 locus indicated for seven species. Orthology across these species indicates that EBLN-1 predates their divergence 54 million years ago. Monte Carlo simulations in which a consensus EBLN sequence was allowed to evolve neutrally at the primate neutral rate for this length of time showed that the average number of stop codons expected after this time is fifteen. (b) The distribution of the number of stop codons from 100,000 simulation replicates. Confidence intervals are indicated.
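The kind of Monte Carlo simulation described in the Figure 7 legend can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: it uses a Jukes-Cantor substitution model with no insertions or deletions, a randomly generated stop-free placeholder in place of the real consensus EBLN sequence, an assumed neutral rate (`ASSUMED_NEUTRAL_RATE`), and far fewer replicates than the 100,000 reported. All names and parameter values are hypothetical.

```python
import math
import random

STOP_CODONS = {"TAA", "TAG", "TGA"}
BASES = "ACGT"


def count_stop_codons(seq):
    """Count in-frame stop codons, reading from the first base."""
    return sum(seq[i:i + 3] in STOP_CODONS for i in range(0, len(seq) - 2, 3))


def evolve_neutrally(seq, subs_per_site, rng):
    """Evolve seq under Jukes-Cantor for the given expected substitutions per site."""
    # Probability that a site differs from its ancestral state under Jukes-Cantor.
    p_change = 0.75 * (1.0 - math.exp(-4.0 * subs_per_site / 3.0))
    out = []
    for base in seq:
        if rng.random() < p_change:
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
    return "".join(out)


def simulate_stop_counts(seq, rate_per_site_per_year, years, replicates, seed=0):
    """Return the number of in-frame stop codons in each simulated replicate."""
    rng = random.Random(seed)
    subs_per_site = rate_per_site_per_year * years
    return [count_stop_codons(evolve_neutrally(seq, subs_per_site, rng))
            for _ in range(replicates)]


# Placeholder open reading frame: 300 random non-stop codons (NOT the EBLN sequence).
_rng = random.Random(1)
_codons = [a + b + c for a in BASES for b in BASES for c in BASES
           if a + b + c not in STOP_CODONS]
placeholder_orf = "".join(_rng.choice(_codons) for _ in range(300))

ASSUMED_NEUTRAL_RATE = 2.2e-9  # substitutions/site/year; an assumption, not from the paper
counts = simulate_stop_counts(placeholder_orf, ASSUMED_NEUTRAL_RATE, 54e6, replicates=1000)

mean_stops = sum(counts) / len(counts)
ordered = sorted(counts)
interval = (ordered[int(0.025 * len(ordered))], ordered[int(0.975 * len(ordered)) - 1])
print(mean_stops, interval)
```

Comparing the observed number of stop codons at a locus with the simulated distribution indicates whether the open reading frame is more intact than neutral evolution alone would predict.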

References

    1. Benveniste RE, Todaro GJ. Evolution of C-type viral genes: inheritance of exogenously acquired viral genes. Nature. 1974;252:456–459.
    2. Jaenisch R. Germ line integration and Mendelian transmission of the exogenous Moloney leukemia virus. Proc Natl Acad Sci U S A. 1976;73:1260–1264.
    3. Bejarano ER, Khashoggi A, Witty M, Lichtenstein C. Integration of multiple repeats of geminiviral DNA into the nuclear genome of tobacco during evolution. Proc Natl Acad Sci U S A. 1996;93:759–764.
    4. Herniou E, Martin J, Miller K, Cook J, Wilkinson M, et al. Retroviral diversity and distribution in vertebrates. J Virol. 1998;72:5955–5966.
    5. Tristem M. Identification and characterization of novel human endogenous retrovirus families by phylogenetic screening of the human genome mapping project database. J Virol. 2000;74:3715–3730.