PLoS Pathog. 2010 Jul 29;6(7):e1001030.
doi: 10.1371/journal.ppat.1001030.

Unexpected Inheritance: Multiple Integrations of Ancient Bornavirus and Ebolavirus/Marburgvirus Sequences in Vertebrate Genomes

Free PMC article

Vladimir A Belyi et al. PLoS Pathog.

Vertebrate genomes contain numerous copies of retroviral sequences, acquired over the course of evolution. Until recently, retroviruses were thought to be the only type of RNA virus to be so represented, because integration of a DNA copy of their genome is required for their replication. In this study, an extensive sequence comparison was conducted in which 5,666 viral genes from all known non-retroviral families with single-stranded RNA genomes were matched against the germline genomes of 48 vertebrate species, to determine if such viruses could also contribute to the vertebrate genetic heritage. In 19 of the tested vertebrate species, we discovered as many as 80 high-confidence examples of genomic DNA sequences that appear to be derived, as long ago as 40 million years, from ancestral members of 4 currently circulating virus families with single-stranded RNA genomes. Surprisingly, almost all of the sequences are related to only two families in the Order Mononegavirales: the Bornaviruses and the Filoviruses, which cause lethal neurological disease and hemorrhagic fevers, respectively. Based on signature landmarks, some, and perhaps all, of the endogenous virus-like DNA sequences appear to be LINE element-facilitated integrations derived from viral mRNAs. The integrations represent genes that encode viral nucleocapsid, RNA-dependent RNA polymerase, matrix and, possibly, glycoproteins. Integrations are generally limited to one or very few copies of a related viral gene per species, suggesting that once the initial germline integration was obtained (or selected), later integrations failed or provided little advantage to the host.
The conservation of relatively long open reading frames for several of the endogenous sequences, the virus-like protein regions represented, and a potential correlation between their presence and a species' resistance to the diseases caused by these pathogens, are consistent with the notion that their products provide some important biological advantage to the species. In addition, the viruses could also benefit, as some resistant species (e.g. bats) may serve as natural reservoirs for their persistence and transmission. Given the stringent limitations imposed in this informatics search, the examples described here should be considered a low estimate of the number of such integration events that have persisted over evolutionary time scales. Clearly, the sources of genetic information in vertebrate genomes are much more diverse than previously suspected.

Conflict of interest statement

The authors have declared that no competing interests exist.


Figure 1. Organization and transcription maps of Borna disease virus (BDV), Marburgvirus (MARV) and Ebolavirus (EBOV) genomes.
Open reading frames are labeled and indicated by colored boxes, non-coding regions by empty boxes. For BDV, the locations of transcription initiation (S) and termination (T) sites are shown on the scale beneath the genome map. The horizontal arrows below the scale depict the origins of primary transcripts. The two longest BDV transcripts are subjected to alternative splicing to form multiple mature mRNAs. For MARV and EBOV, vertical arrows indicate transcription initiation and termination sites, except for regions of overlap, where these sites are not marked. The pink arrowhead points to the location of an editing site in the GP gene of EBOV.
Figure 2. Phylogenetic tree of vertebrates that encode Bornavirus- and Filovirus-like proteins in their genomes.
Bornavirus-related sequences are denoted by icosahedrons and Filovirus-related sequences by triangles. Times of the viral gene integrations are approximate, unless discussed in the text.
Figure 3. Phylogeny of endogenous Filovirus VP35-like gene integrations.
The tree was built with PHYLIP based on a ClustalW alignment, using only aligned residues present in all sequences. The tree is unrooted (the wallaby integration was used as an outgroup for the given representation). Bootstrap values are at least 92, with the exception of Sudan EBOV (54), Côte d'Ivoire EBOV (77), and MARV in bats (70).
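The restriction to "aligned residues present in all sequences" amounts to dropping every alignment column that contains a gap in at least one sequence. The following is only a minimal sketch of that filtering step, not the authors' pipeline: the function name `gap_free_columns`, the gap characters, and the toy sequences are all hypothetical and chosen for illustration.

```python
def gap_free_columns(alignment, gap_chars="-."):
    """Keep only the alignment columns in which every sequence has a residue.

    `alignment` maps a sequence name to its aligned string; all strings
    must have the same length. The gap characters are an assumption.
    """
    names = list(alignment)
    seqs = [alignment[name] for name in names]
    if len({len(s) for s in seqs}) > 1:
        raise ValueError("all aligned sequences must have the same length")

    # Indices of columns where no sequence carries a gap character.
    keep = [
        i
        for i, column in enumerate(zip(*seqs))
        if not any(residue in gap_chars for residue in column)
    ]
    return {name: "".join(alignment[name][i] for i in keep) for name in names}


# Toy alignment (made-up sequences, not data from the paper).
toy = {
    "seq_a": "MK-LVE",
    "seq_b": "MKALV-",
    "seq_c": "MRALVE",
}
filtered = gap_free_columns(toy)
print(filtered)
```

In this toy case the third and sixth columns are discarded because one sequence has a gap there, leaving four columns per sequence for tree building.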
Figure 4. Domain structure of BDV N (p40) protein, and its alignment with open reading frames encoded in human and squirrel endogenous BDV N-like sequences.
Shaded blue rectangles show open reading frames as seen in today's integrations. Solid black lines show total alignment found by BLAST.
Figure 5. Domain structure of the EBOV N protein, and its alignment with several related endogenous sequences identified by the BLAST program.
Amino acid coordinates marked with (&) have been mapped to the Zaire strain of Ebolavirus and may differ slightly from coordinates in Supplemental Table S4.
Figure 6. Comparisons of Filovirus VP35 protein sequences with those of related endogenous sequences.
A) Domain structure of the EBOV (Zaire) VP35 protein, and its alignment with related endogenous sequences in the microbat and tarsier genomes. Shaded blue rectangles show open reading frames as seen in today's integrations. Solid black lines show total alignment found by BLAST. B) Multiple alignment of endogenous sequences in wallaby, tarsier, and microbat with the present-day strains of EBOV and MARV. We used the default color scheme for ClustalW alignment in the Jalview program.


