Estimates of the basic reproduction number for rubella using seroprevalence data and indicator-based approaches

PLoS Comput Biol. 2022 Mar 3;18(3):e1008858. doi: 10.1371/journal.pcbi.1008858. eCollection 2022 Mar.


The basic reproduction number (R0) of an infection determines the impact of its control. For many endemic infections, R0 is often estimated from appropriate country-specific seroprevalence data. Studies sometimes pool estimates from the same region for settings lacking seroprevalence data, but the reliability of this approach is unclear. Plausibly, indicator-based approaches could predict R0 for such settings. We calculated R0 for rubella for 98 settings and correlated its value against 66 demographic, economic, education, housing and health-related indicators. We also trained a random forest regression algorithm using these indicators as the input and R0 as the output. We used the mean-square error to compare the performances of the random forest, simple linear regression and a regional averaging method in predicting R0 using 4-fold cross-validation. R0 was <5, 5-10 and >10 for 81, 14 and 3 settings, respectively, with no apparent regional differences; in the limited available data, it was usually lower for rural than for urban areas. R0 was most correlated with educational attainment and household indicators for the Pearson and Spearman correlation coefficients, respectively, and with poverty-related indicators followed by the crude death rate for the Maximum Information Coefficient, although each correlation was relatively weak (Pearson correlation coefficient: 0.4, 95% CI: (0.24, 0.48) for educational attainment). A random forest did not consistently predict R0 better than simple linear regression, with relative performance depending on the subsets of training indicators and studies, and neither outperformed a regional averaging approach. R0 for rubella is typically low, and using indicators to estimate its value is not straightforward. A regional averaging approach may provide as reliable an estimate of R0 for settings lacking seroprevalence data as one based on indicators.
The findings may be relevant for other infections and studies estimating the disease burden and the impact of interventions for settings lacking seroprevalence data.
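
The cross-validated comparison described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names, the single-indicator linear model, the fallback to the overall mean for unseen regions, and the randomly generated example rows are all assumptions made for illustration, and the random forest is left out to keep the example free of external dependencies. It shows only how a 4-fold mean-square error could be computed for simple linear regression and for a regional averaging predictor.

```python
import random
from statistics import mean

def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k roughly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def fit_simple_linear(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one predictor)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return slope, my - slope * mx

def cv_mse(rows, k=4, seed=0):
    """rows: list of (region, indicator, r0) tuples (hypothetical layout).

    Returns the k-fold cross-validated mean-square error of each predictor.
    """
    sq_err = {"linear": [], "regional_mean": []}
    for test_idx in k_fold_indices(len(rows), k, seed):
        held_out = set(test_idx)
        train = [r for i, r in enumerate(rows) if i not in held_out]

        # Predictor 1: simple linear regression on one indicator.
        slope, intercept = fit_simple_linear(
            [r[1] for r in train], [r[2] for r in train]
        )

        # Predictor 2: mean R0 of training settings in the same region
        # (assumption: fall back to the overall training mean if the
        # region does not occur in the training folds).
        overall = mean(r[2] for r in train)
        by_region = {}
        for region, _, r0 in train:
            by_region.setdefault(region, []).append(r0)

        for i in test_idx:
            region, x, r0 = rows[i]
            sq_err["linear"].append((slope * x + intercept - r0) ** 2)
            regional = mean(by_region[region]) if region in by_region else overall
            sq_err["regional_mean"].append((regional - r0) ** 2)
    return {name: mean(errs) for name, errs in sq_err.items()}

# Randomly generated placeholder rows, NOT data from the study.
rng = random.Random(1)
rows = [
    (rng.choice(["A", "B", "C"]), rng.uniform(0, 1), rng.uniform(2, 8))
    for _ in range(40)
]
scores = cv_mse(rows, k=4)
print(scores)
```

A lower cross-validated error for `regional_mean` than for `linear` on real data would correspond to the pattern the abstract reports, where neither indicator-based model outperformed regional averaging.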

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Basic Reproduction Number
  • Humans
  • Reproducibility of Results
  • Rubella* / epidemiology
  • Rural Population
  • Seroepidemiologic Studies

Grants and funding

This work was supported by funding from Gavi, the Vaccine Alliance, via the Vaccine Impact Modelling Consortium (VIMC). VIMC is jointly funded by Gavi, the Vaccine Alliance, and by the Bill & Melinda Gates Foundation (BMGF grant number: OPP1157270). This work was carried out as part of the Vaccine Impact Modelling Consortium, but the views expressed are those of the authors and not necessarily those of the Consortium or its funders. The funders were given the opportunity to review this paper prior to publication, but the final decision on the content of the publication was taken by the authors. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.