J Theor Biol. 2017 May 7;420:68-81.
doi: 10.1016/j.jtbi.2017.01.032. Epub 2017 Jan 24.

Model Distinguishability and Inference Robustness in Mechanisms of Cholera Transmission and Loss of Immunity

Free PMC article

Elizabeth C Lee et al. J Theor Biol. 2017.

Abstract

Mathematical models of cholera and waterborne disease vary widely in their structures, in terms of transmission pathways, loss of immunity, and a range of other features. These differences can affect model dynamics, with different models potentially yielding different predictions and parameter estimates from the same data. Given the increasing use of mathematical models to inform public health decision-making, it is important to assess model distinguishability (whether models can be distinguished based on fit to data) and inference robustness (whether inferences from the model are robust to realistic variations in model structure). In this paper, we examined the effects of uncertainty in model structure in the context of epidemic cholera, testing a range of models with differences in transmission and loss of immunity structure, based on known features of cholera epidemiology. We fit these models to simulated epidemic and long-term data, as well as data from the 2006 Angola epidemic. We evaluated model distinguishability based on fit to data, and whether the parameter values, model behavior, and forecasting ability can accurately be inferred from incidence data. In general, all models were able to successfully fit to all data sets, both real and simulated, regardless of whether the model generating the simulated data matched the fitted model. However, in the long-term data, the best model fits were achieved when the loss of immunity structures matched those of the model that simulated the data. Two parameters, one representing person-to-person transmission and the other representing the reporting rate, were accurately estimated across all models, while the remaining parameters showed broad variation across the different models and data sets. The basic reproduction number (R0) was often poorly estimated even using the correct model, due to practical unidentifiability issues in the waterborne transmission pathway which were consistent across all models. 
Forecasting efforts using noisy data were not successful early in the outbreaks, but once the epidemic peak had been achieved, most models were able to capture the downward incidence trajectory with similar accuracy. Forecasting from noise-free data was generally successful for all outbreak stages using any model. Our results suggest that we are unlikely to be able to infer mechanistic details from epidemic case data alone, underscoring the need for broader data collection, such as immunity/serology status, pathogen dose response curves, and environmental pathogen data. Nonetheless, with sufficient data, conclusions from forecasting and some parameter estimates were robust to variations in the model structure, and comparative modeling can help to determine how realistic variations in model structure may affect the conclusions drawn from models and data.
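To make the two transmission pathways concrete, the following is a minimal sketch of an SIWR-type model (susceptible, infected, water, recovered). The equations are one commonly used rescaled form and are an assumption here, not the authors' exact model. The parameter names `beta_i` (person-to-person transmission), `beta_w` (waterborne transmission), `gamma` (recovery rate), `xi` (pathogen decay rate in water), and `k` (reporting fraction) are illustrative, as are all numerical values. Under this assumed form, R0 = (beta_i + beta_w) / gamma, and when `xi` is large the water compartment closely tracks the infected compartment, which is one way to see why the split between pathways can be hard to recover from case counts alone.

```python
# Illustrative SIWR-type model (assumed form, not taken from the paper).
# State: S susceptible, I infected, W pathogen in water (rescaled), R recovered.

def siwr_rhs(state, beta_i, beta_w, gamma, xi):
    """Right-hand side of the assumed SIWR equations."""
    s, i, w, r = state
    new_infections = beta_i * s * i + beta_w * s * w
    return (
        -new_infections,             # dS/dt
        new_infections - gamma * i,  # dI/dt
        xi * (i - w),                # dW/dt: shedding and decay, rescaled
        gamma * i,                   # dR/dt
    )

def rk4_step(state, dt, params):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(base, slope, h):
        return tuple(x + h * k for x, k in zip(base, slope))
    k1 = siwr_rhs(state, *params)
    k2 = siwr_rhs(shifted(state, k1, dt / 2), *params)
    k3 = siwr_rhs(shifted(state, k2, dt / 2), *params)
    k4 = siwr_rhs(shifted(state, k3, dt), *params)
    return tuple(
        x + dt / 6 * (a + 2 * b + 2 * c + d)
        for x, a, b, c, d in zip(state, k1, k2, k3, k4)
    )

def simulate(params, days=100, steps_per_day=10, i0=1e-4):
    """Return the daily states (S, I, W, R) as population fractions."""
    state = (1.0 - i0, i0, 0.0, 0.0)
    dt = 1.0 / steps_per_day
    daily = [state]
    for _ in range(days):
        for _ in range(steps_per_day):
            state = rk4_step(state, dt, params)
        daily.append(state)
    return daily

def observed_cases(daily, k):
    """Assumed measurement model: reported cases are a fraction k of I."""
    return [k * i for (_, i, _, _) in daily]

def basic_reproduction_number(beta_i, beta_w, gamma):
    """R0 for the assumed SIWR form without demography."""
    return (beta_i + beta_w) / gamma

# Two hypothetical parameter sets with the same R0 but a different
# split between the person-to-person and waterborne pathways.
mostly_person = (0.60, 0.15, 0.25, 5.0)  # beta_i, beta_w, gamma, xi
mostly_water = (0.15, 0.60, 0.25, 5.0)
curve_a = observed_cases(simulate(mostly_person), k=0.1)
curve_b = observed_cases(simulate(mostly_water), k=0.1)
```

Plotting `curve_a` against `curve_b` gives a rough sense of how much, or how little, the pathway split changes the observed epidemic curve under these assumed values.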

Keywords: Comparative modeling; Model misspecification; Model structure; Parameter estimation.

Figures

Figure 1. Diagrams of study models
First row (left to right): Exponential model, Dose Response model, Asymptomatic model. Second row (left to right): Gamma model and Progressive Susceptibility model. Red compartments represent the infected population and red arrows represent person-person transmission. Blue compartments represent pathogen concentration in water while blue arrows represent pathogen-person transmission. Black compartments are susceptible or partially susceptible, while white compartments are immune. Grey arrows indicate pathogen shedding.
Figure 2
Fits to 100-day simulated model data without noise (left column), with normal noise (middle column), and with Poisson noise (right column), using naive starting parameters. Each row represents data simulated by the model labeled at right. Model fits are overlaid, thus obscuring some of the model fits in the figure. All fitting models were able to capture the mean epidemic data despite added noise and model misspecification.
Figure 3
Percent deviation of estimates from true parameter values, grouped by parameter for A) the simulated 100-day data and B) the simulated 3-year data. The model used to fit the data and estimate the parameter is indicated by color. The median across all estimates (i.e., across added noise type, simulation model, and data duration) is marked with a black point in the distribution, and the black dashed lines mark ±20% deviation. Distribution ranges for the βW, α, and ξ subplots are trimmed for visibility. C) Scatterplot of βW and ξ estimates, colored by fitting model. Data points in the distribution tails were truncated for ease of viewing; see Supplementary Figure S2 for the full plot.
Figure 4
Forecasts from informed starting values to 100-day data (indicated by row) up through 10 days (left column), 30 days (middle column), and 50 days (right column) with added normal noise. Model fits are overlaid, thus obscuring some of the model fits in the figure.
Figure 5
Parameter estimates derived from naive starting parameter forecasts with 10, 30, and 50 days of data, organized A) by parameter and B) by added noise.
Figure 6
“Naive” (left) and “informed” (right) starting parameter fits to 2006 Angola epidemic data.
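The percent deviation reported in Figure 3 can be sketched as a signed relative error, with the ±20% band as a simple threshold check. The function names and the convention (estimate minus true value, divided by the true value) are assumptions for illustration; the paper may define the quantity slightly differently.

```python
def percent_deviation(estimate, true_value):
    """Signed deviation of an estimate from the true value, in percent."""
    if true_value == 0:
        raise ValueError("percent deviation is undefined for a true value of 0")
    return 100.0 * (estimate - true_value) / true_value

def within_band(estimate, true_value, band=20.0):
    """True if the estimate lies within +/- band percent of the true value."""
    return abs(percent_deviation(estimate, true_value)) <= band
```

For example, an estimate of 1.5 for a true value of 1.0 deviates by 50% and falls outside the ±20% band.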
