Topology of viral evolution

Proc Natl Acad Sci U S A. 2013 Nov 12;110(46):18566-71. doi: 10.1073/pnas.1313480110. Epub 2013 Oct 29.

Abstract

The tree structure is currently the accepted paradigm to represent evolutionary relationships between organisms, species, or other taxa. However, horizontal, or reticulate, genomic exchanges are pervasive in nature and confound characterization of phylogenetic trees. Drawing from algebraic topology, we present a unique evolutionary framework that comprehensively captures both clonal and reticulate evolution. We show that whereas clonal evolution can be summarized as a tree, reticulate evolution exhibits nontrivial topology of dimension greater than zero. Our method effectively characterizes clonal evolution, reassortment, and recombination in RNA viruses. Beyond detecting reticulate evolution, we succinctly recapitulate the history of complex genetic exchanges involving more than two parental strains, such as the triple reassortment of H7N9 avian influenza and the formation of circulating HIV-1 recombinants. In addition, we identify recurrent, large-scale patterns of reticulate evolution, including frequent PB2-PB1-PA-NP cosegregation during avian influenza reassortment. Finally, we bound the rate of reticulate events (e.g., 20 reassortments per year in avian influenza). Our method provides an evolutionary perspective that not only captures reticulate events precluding phylogeny, but also indicates the evolutionary scales where phylogenetic inference could be accurate.
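
The contrast between tree-like and reticulate data can be illustrated with a toy calculation. The sketch below is not the authors' pipeline: it assumes Hamming distance between short made-up binary "genomes", builds a Vietoris-Rips complex at a single distance scale `eps`, and counts independent loops (the first Betti number) over GF(2). The sequences and the function names `hamming`, `gf2_rank`, and `betti_1` are hypothetical and chosen only for illustration. A full analysis would track how such loops appear and disappear across all scales (persistent homology) rather than at one scale.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def gf2_rank(rows):
    """Rank over GF(2) of a matrix whose rows are integer bitmasks."""
    basis = {}  # highest set bit -> reduced row with that pivot
    for row in rows:
        while row:
            pivot = row.bit_length() - 1
            if pivot not in basis:
                basis[pivot] = row
                break
            row ^= basis[pivot]  # eliminate the pivot bit and keep reducing
    return len(basis)


def betti_1(seqs, eps):
    """First Betti number of the Vietoris-Rips complex at scale eps.

    Vertices are sequences, edges join pairs at Hamming distance <= eps,
    and a triangle is filled whenever all three of its edges are present.
    """
    n = len(seqs)
    edges = [(i, j) for i, j in combinations(range(n), 2)
             if hamming(seqs[i], seqs[j]) <= eps]
    edge_index = {e: k for k, e in enumerate(edges)}
    triangles = [t for t in combinations(range(n), 3)
                 if all(p in edge_index for p in combinations(t, 2))]
    # Boundary of an edge: its two endpoints.
    d1 = [(1 << i) | (1 << j) for i, j in edges]
    # Boundary of a triangle: its three edges.
    d2 = [sum(1 << edge_index[p] for p in combinations(t, 2))
          for t in triangles]
    # b1 = dim(ker d1) - rank(d2) = (#edges - rank d1) - rank d2
    return len(edges) - gf2_rank(d1) - gf2_rank(d2)


# Hypothetical clonal data: an ancestor and three single-mutation descendants.
clonal = ["000", "100", "010", "001"]
# Hypothetical reticulate data: two loci, where "11" and "00" can each be
# read as a mix of one locus from "01" and the other from "10".
reticulate = ["00", "01", "11", "10"]

for eps in (1, 2):
    print(eps, betti_1(clonal, eps), betti_1(reticulate, eps))
```

In this toy setting the star-shaped clonal sample has no loops at either scale, whereas the four reticulate sequences form an unfilled 4-cycle at `eps = 1` that is filled in once `eps = 2` connects every pair.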

Keywords: gene flow; persistent homology; topological data analysis.

Publication types

  • Comparative Study
  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Base Sequence
  • Classification / methods*
  • Computational Biology
  • Computer Simulation
  • Evolution, Molecular*
  • Gene Transfer, Horizontal / genetics*
  • HIV-1 / genetics
  • Influenza A Virus, H1N1 Subtype / genetics
  • Influenza A Virus, H7N9 Subtype / genetics
  • Models, Genetic*
  • Molecular Sequence Annotation
  • Phylogeny*
  • Principal Component Analysis
  • Reassortant Viruses / genetics*
  • Sequence Homology