Divide-and-conquer: machine-learning integrates mammalian and viral traits with network features to predict virus-mammal associations

Nat Commun. 2021 Jun 25;12(1):3954. doi: 10.1038/s41467-021-24085-w.

Abstract

Our knowledge of viral host ranges remains limited. Completing this picture by identifying unknown hosts of known viruses is an important research aim that can help identify and mitigate zoonotic and animal-disease risks, such as spillover from animal reservoirs into human populations. To address this knowledge gap, we apply a divide-and-conquer approach that separates viral, mammalian and network features into three unique perspectives, each predicting associations independently, to enhance predictive power. Our approach predicts over 20,000 unknown associations between known viruses and susceptible mammalian species, suggesting that current knowledge underestimates the number of associations in wild and semi-domesticated mammals by a factor of 4.3, and the average potential mammalian host range of viruses by a factor of 3.2. In particular, our results highlight a significant knowledge gap in the wild reservoirs of important zoonotic and domesticated mammals' viruses: specifically, lyssaviruses, bornaviruses and rotaviruses.
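
The divide-and-conquer idea described in the abstract, three perspectives that each score a virus-mammal pair independently before the scores are brought together, can be illustrated with a minimal sketch. The abstract does not say how the perspectives are combined, so the unweighted average, the 0.5 cutoff and the function name `combine_perspectives` below are all assumptions made for illustration only, not the authors' method.

```python
from statistics import mean


def combine_perspectives(p_viral, p_mammalian, p_network, threshold=0.5):
    """Combine three independent association probabilities.

    Hypothetical helper: an unweighted average and a fixed cutoff are
    assumed here; the abstract does not specify the combination rule.
    Returns (combined_score, predicted_association).
    """
    probabilities = (p_viral, p_mammalian, p_network)
    for p in probabilities:
        if not 0.0 <= p <= 1.0:
            raise ValueError("each probability must lie in [0, 1]")
    score = mean(probabilities)
    return score, score >= threshold


# Illustrative inputs only; these are not values from the study.
score, predicted = combine_perspectives(0.9, 0.6, 0.3)
print(f"combined score = {score:.2f}, predicted association = {predicted}")
```

Keeping each perspective's score separate until the final step means one weak feature set (for example, sparse network data for a poorly studied host) lowers the combined score without overriding the other two.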

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Disease Reservoirs / virology
  • Host Specificity
  • Humans
  • Machine Learning*
  • Mammals / classification
  • Mammals / physiology
  • Mammals / virology*
  • Reproducibility of Results
  • Virus Diseases / transmission
  • Virus Diseases / virology
  • Virus Physiological Phenomena*
  • Viruses / classification
  • Zoonoses / transmission
  • Zoonoses / virology

Associated data

  • figshare/10.6084/m9.figshare.13270304