Predicting reservoir hosts and arthropod vectors from evolutionary signatures in RNA virus genomes

Science. 2018 Nov 2;362(6414):577-580. doi: 10.1126/science.aap9072.

Abstract

Identifying the animal origins of RNA viruses requires years of field and laboratory studies that stall responses to emerging infectious diseases. Using large genomic and ecological datasets, we demonstrate that animal reservoirs and the existence and identity of arthropod vectors can be predicted directly from viral genome sequences via machine learning. We illustrate the ability of these models to predict the epidemiology of diverse viruses across most human-infective families of single-stranded RNA viruses, including 69 viruses with previously elusive or never-investigated reservoirs or vectors. Models such as these, which capitalize on the proliferation of low-cost genomic sequencing, can narrow the time lag between virus discovery and targeted research, surveillance, and management.
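
As a rough illustration of what predicting from "evolutionary signatures" in a genome sequence could look like, the sketch below computes one commonly used compositional feature, the observed/expected ratio of each dinucleotide, and assigns a query sequence the label of its closest labelled neighbour. The abstract does not specify the features or the learning algorithm, so the feature choice, the nearest-neighbour rule, the function names, and the toy sequences are all assumptions made for illustration, not the authors' method or data.

```python
import math
from collections import Counter
from itertools import product

BASES = "ACGU"
DINUCLEOTIDES = ["".join(p) for p in product(BASES, repeat=2)]


def dinucleotide_bias(seq: str) -> dict:
    """Observed/expected ratio for each of the 16 dinucleotides.

    Expected frequency is the product of the two single-base frequencies,
    so a ratio below 1 means the pair is under-represented.
    """
    seq = seq.upper().replace("T", "U")  # accept cDNA-style input
    n_bases = sum(1 for b in seq if b in BASES)
    # Count only adjacent pairs in which both bases are unambiguous.
    pairs = Counter(
        seq[i:i + 2]
        for i in range(len(seq) - 1)
        if seq[i] in BASES and seq[i + 1] in BASES
    )
    n_pairs = sum(pairs.values())
    if n_pairs == 0:
        raise ValueError("need at least one unambiguous dinucleotide")
    base_freq = {b: seq.count(b) / n_bases for b in BASES}
    bias = {}
    for d in DINUCLEOTIDES:
        expected = base_freq[d[0]] * base_freq[d[1]]
        observed = pairs[d] / n_pairs
        bias[d] = observed / expected if expected > 0 else 0.0
    return bias


def feature_vector(seq: str) -> list:
    """Dinucleotide biases in a fixed order, usable as model input."""
    bias = dinucleotide_bias(seq)
    return [bias[d] for d in DINUCLEOTIDES]


def nearest_label(query_seq: str, labelled_seqs: dict) -> str:
    """Label of the training sequence closest to the query (Euclidean).

    A 1-nearest-neighbour rule is a stand-in for a real trained model.
    """
    query = feature_vector(query_seq)
    return min(
        labelled_seqs,
        key=lambda label: math.dist(query, feature_vector(labelled_seqs[label])),
    )


# Hypothetical toy sequences with placeholder labels; not real viral genomes.
toy_training = {
    "reservoir_A": "CGCGCGCGCG",
    "reservoir_B": "AUAUAUAUAU",
}
predicted = nearest_label("CGCGCGCGCG", toy_training)
```

In practice a feature vector like this would be computed for every genome with a known reservoir or vector and passed to a supervised classifier; the nearest-neighbour rule here only keeps the example self-contained.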

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Arthropod Vectors / genetics*
  • Biodiversity
  • Communicable Diseases, Emerging / prevention & control*
  • Communicable Diseases, Emerging / virology
  • Disease Reservoirs / virology*
  • Epidemiological Monitoring*
  • Evolution, Molecular
  • Genome, Viral
  • Genomics
  • Host-Pathogen Interactions*
  • Humans
  • Machine Learning*
  • RNA Virus Infections / prevention & control*
  • RNA Virus Infections / virology
  • RNA Viruses / classification*
  • RNA Viruses / genetics*