Machine-learning Prognostic Models from the 2014-16 Ebola Outbreak: Data-harmonization Challenges, Validation Strategies, and mHealth Applications

EClinicalMedicine. 2019 Jun 22:11:54-64. doi: 10.1016/j.eclinm.2019.06.003. eCollection 2019 May-Jun.


Background: Ebola virus disease (EVD) plagues low-resource and difficult-to-access settings. Machine learning prognostic models and mHealth tools could improve the understanding and use of evidence-based care guidelines in such settings. However, data incompleteness and lack of interoperability limit model generalizability. This study harmonizes diverse datasets from the 2014-16 EVD epidemic and generates several prognostic models incorporated into the novel Ebola Care Guidelines app that provides informed access to recommended evidence-based guidelines.

Methods: Multivariate logistic regression was applied to investigate survival outcomes in 470 patients admitted to five Ebola treatment units in Liberia and Sierra Leone at various timepoints during 2014-16. We generated a parsimonious model (viral load, age, temperature, bleeding, jaundice, dyspnea, dysphagia, and time-to-presentation) and several fallback models for use when some of these variables are unavailable. All models were externally validated against two independent datasets and compared with further models that included expert observational wellness assessments. The models were incorporated into an app that highlights the signs and symptoms contributing most to prognosis.
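The fallback logic and the highlighting of high-contribution features described above can be illustrated with a minimal sketch. It is not the study's implementation: the model names, the fallback order, the feature keys, and every coefficient are hypothetical placeholders, and inputs are assumed to be pre-scaled numbers (binary signs coded 0/1).

```python
import math

# All coefficients are made-up placeholders for illustration only;
# they are NOT the fitted values from the study.
# Models are tried in order; the first one whose inputs are all present is used.
MODELS = [
    {
        "name": "parsimonious",
        "intercept": -1.0,
        "coef": {
            "viral_load": 0.8, "age": 0.6, "temperature": 0.1,
            "bleeding": 0.1, "jaundice": 0.1, "dyspnea": 0.1,
            "dysphagia": 0.1, "days_to_presentation": 0.05,
        },
    },
    {
        "name": "minimal",  # hypothetical fallback: age and viral load only
        "intercept": -0.8,
        "coef": {"viral_load": 0.9, "age": 0.7},
    },
    {
        "name": "clinical_only",  # hypothetical fallback: no laboratory result
        "intercept": -0.5,
        "coef": {
            "age": 0.7, "temperature": 0.2, "bleeding": 0.3,
            "jaundice": 0.3, "dyspnea": 0.2, "dysphagia": 0.2,
        },
    },
]


def select_model(patient):
    """Return the first model whose features are all available for this patient."""
    for model in MODELS:
        if all(patient.get(name) is not None for name in model["coef"]):
            return model
    raise ValueError("no model can be evaluated with the available variables")


def predict(patient):
    """Return (model name, predicted probability, contributions ranked by size)."""
    model = select_model(patient)
    contributions = {
        name: weight * patient[name] for name, weight in model["coef"].items()
    }
    logit = model["intercept"] + sum(contributions.values())
    probability = 1.0 / (1.0 + math.exp(-logit))
    # Rank features by absolute contribution to the logit, largest first.
    ranked = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return model["name"], probability, ranked
```

Calling `predict` on a record with no viral load value would skip the first two models and fall through to the hypothetical `clinical_only` model; the ranked contributions are the kind of quantity an app could surface to show which signs drive a given prediction.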

Findings: The parsimonious model approached the predictive power of observational assessments by experienced clinicians (Area-Under-the-Curve, AUC = 0.70-0.79, accuracy = 0.64-0.74) and maintained its performance across subcohorts with different healthcare-seeking behaviors. Age and viral load carried more than five times the weight of the other features, and a minimal model built on them achieved a similar AUC, albeit at the cost of specificity.
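For readers less familiar with the metrics reported above, the following sketch shows one common way to compute AUC, accuracy, and specificity from predicted probabilities. The labels and scores are toy values invented for the example, not study data, and the 0.5 decision threshold is an assumption.

```python
def auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    # Count pairwise wins; ties count as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def threshold_metrics(labels, scores, threshold=0.5):
    """Accuracy, sensitivity, and specificity at a fixed decision threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both outcome classes must be present")
    return {
        "accuracy": (tp + tn) / len(labels),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Toy example (invented values): 1 = event, 0 = no event.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = auc(toy_labels, toy_scores)
toy_metrics = threshold_metrics(toy_labels, toy_scores)
```

Because AUC depends only on how cases are ranked while specificity depends on the chosen threshold, two models can share a similar AUC yet differ in specificity, which is the kind of trade-off the findings describe.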

Interpretation: Clinically guided prognostic models can recapitulate clinical expertise and be useful when such expertise is unavailable. Incorporating these models into mHealth tools may facilitate their interpretation and provide informed access to comprehensive clinical guidelines.

Funding: Howard Hughes Medical Institute, US National Institutes of Health, Bill & Melinda Gates Foundation, International Medical Corps, UK Department for International Development, and GOAL Global.

Keywords: Clinical intuition; Data visualization; Ebola virus disease; Machine learning; Prognostic models; Severity score; Supportive care guidelines; mHealth.