A machine learning-based universal outbreak risk prediction tool

Tianyu Zhang; Fethi Rabhi; Xin Chen; Hye-Young Paik; Chandini Raina MacIntyre

doi:10.1016/j.compbiomed.2023.107876

Comput Biol Med. 2024 Feb:169:107876. doi: 10.1016/j.compbiomed.2023.107876. Epub 2023 Dec 24.

Authors

Tianyu Zhang¹, Fethi Rabhi², Xin Chen³, Hye-Young Paik⁴, Chandini Raina MacIntyre⁵

Affiliations

¹ FinanceIT Research Group, University of New South Wales, Sydney, NSW, Australia. Electronic address: TianyuZhangAndrew@outlook.com.
² FinanceIT Research Group, University of New South Wales, Sydney, NSW, Australia.
³ Biosecurity Program, The Kirby Institute, University of New South Wales, Sydney, NSW, 2052, Australia.
⁴ School of Computer Science and Engineering, Faculty of Engineering, University of New South Wales, Sydney, NSW, 2052, Australia.
⁵ Biosecurity Program, The Kirby Institute, University of New South Wales, Sydney, NSW, 2052, Australia; College of Public Service & Community Solutions, Arizona State University, Tempe, AZ, 85004, United States.

PMID: 38176209
DOI: 10.1016/j.compbiomed.2023.107876

Abstract

To prevent and control the increasing number of serious epidemics, the ability to predict the risk posed by emerging outbreaks is essential. However, most current risk prediction tools, except EPIRISK, are designed for only one specific disease and one country. Differences between countries and diseases (e.g., different economic conditions and different modes of transmission) pose challenges for building models with cross-country and cross-disease prediction capabilities. This lack of universality hampers domestic and international efforts to control and prevent pandemic outbreaks. To address this problem, we used outbreak data from 43 diseases in 206 countries to develop a universal risk prediction system that can be used across countries and diseases. This system uses five machine learning models (Neural Network, XGBoost, Logistic Boost, Random Forest and Kernel SVM) that each make a prediction and then vote to produce an ensemble prediction. It makes predictions with around 80%-90% accuracy from economic, cultural, social, and epidemiological factors. Three different datasets were designed to test the performance of the ML models under different realistic situations. This prediction system has strong predictive ability, adaptability, and generality. It can give universal outbreak risk assessments that are not limited by border or disease type, facilitating rapid response to pandemic outbreaks, government decision-making and international cooperation.
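
The voting step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the voting scheme, so this assumes simple hard majority voting over per-model class labels. The model keys, the `majority_vote` function, and the "high"/"low" risk labels are hypothetical names chosen for illustration, not the authors' implementation.

```python
from collections import Counter

# Hypothetical keys for the five model families named in the abstract.
MODELS = (
    "neural_network",
    "xgboost",
    "logistic_boost",
    "random_forest",
    "kernel_svm",
)


def majority_vote(predictions):
    """Return the label chosen by the most models for one outbreak record.

    `predictions` maps a model name to that model's predicted risk label.
    Assumption: hard (label-level) voting; the paper may use another scheme.
    """
    if not predictions:
        raise ValueError("at least one model prediction is required")
    counts = Counter(predictions.values())
    # most_common(1) returns the single (label, count) pair with the top count.
    label, _ = counts.most_common(1)[0]
    return label


# Made-up predictions for a single record, for illustration only.
example = {
    "neural_network": "high",
    "xgboost": "high",
    "logistic_boost": "low",
    "random_forest": "high",
    "kernel_svm": "low",
}

print(majority_vote(example))  # high
```

With an odd number of models and two risk classes, a hard vote cannot tie, which is one plausible reason to combine five models; with more than two classes a tie-breaking rule would also be needed.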

Keywords: Epidemics; Machine learning; Outbreak risk prediction; Public health.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Disease Outbreaks*
Machine Learning
Neural Networks, Computer*
Pandemics
Support Vector Machine