Methodological issues of retrospective surveys for measuring mortality of highly clustered diseases: case study of the 2014-16 Ebola outbreak in Bo District, Sierra Leone

Glob Health Action. 2024 Dec 31;17(1):2331291. doi: 10.1080/16549716.2024.2331291. Epub 2024 Apr 26.


Background: There is a lack of empirical data on design effects (DEFF) for mortality rate for highly clustered data such as with Ebola virus disease (EVD), along with a lack of documentation of methodological limitations and operational utility of mortality estimated from cluster-sampled studies when the DEFF is high.

Objectives: The objectives of this paper are to report EVD mortality rate and DEFF estimates, and discuss the methodological limitations of cluster surveys when data are highly clustered such as during an EVD outbreak.

Methods: We analysed the outputs of two independent population-based surveys conducted at the end of the 2014-2016 EVD outbreak in Bo District, Sierra Leone, one in urban and one in rural areas. In each area, 35 clusters of 14 households were selected with probability proportional to population size. We collected information on morbidity, mortality and changes in household composition during the recall period (May 2014 to April 2015). Rates were calculated for all-cause, all-age, under-5 and EVD-specific mortality, by area and overall. Crude and adjusted mortality rates were estimated using Poisson regression, accounting for the survey sample weights and the clustered design.
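The cluster selection step described above can be illustrated with a minimal sketch of systematic sampling with probability proportional to size (PPS). This is not the authors' code: the abstract does not say which PPS variant was used, and the function name `pps_systematic_sample` and the village sizes below are hypothetical, chosen only for illustration.

```python
import random


def pps_systematic_sample(sizes, n_clusters, seed=None):
    """Pick n_clusters sampling units with probability proportional to size.

    sizes      : population size of each unit (hypothetical input)
    n_clusters : number of clusters to draw (35 per area in the study)
    Returns a list of unit indices; a unit larger than the sampling
    interval can be selected more than once.
    """
    rng = random.Random(seed)
    total = sum(sizes)
    interval = total / n_clusters          # sampling interval
    start = rng.random() * interval        # random start within the first interval
    targets = [start + k * interval for k in range(n_clusters)]

    selected = []
    cumulative = 0                         # population of all units before unit idx
    idx = 0
    for t in targets:
        # Advance until the target falls inside unit idx's cumulative range.
        while idx < len(sizes) - 1 and cumulative + sizes[idx] <= t:
            cumulative += sizes[idx]
            idx += 1
        selected.append(idx)
    return selected


# Hypothetical sampling frame: 200 units with made-up population sizes.
frame_rng = random.Random(1)
village_sizes = [frame_rng.randint(100, 3000) for _ in range(200)]
clusters = pps_systematic_sample(village_sizes, n_clusters=35, seed=42)
print(len(clusters), "clusters selected")
```

Larger units occupy a wider stretch of the cumulative population line, so they are more likely to contain one of the evenly spaced targets. Selecting a fixed number of households (14 here) within each chosen cluster then gives every household roughly the same overall selection probability.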

Results: Overall, 980 households and 6,522 individuals participated across the two surveys. A total of 64 deaths were reported, of which 20 were attributed to EVD. The crude and EVD-specific mortality rates were 0.35/10,000 person-days (95% CI: 0.23-0.52) and 0.12/10,000 person-days (95% CI: 0.05-0.32), respectively. The DEFF was 5.53 for EVD mortality and 1.53 for non-EVD mortality. DEFF for EVD-specific mortality was 6.18 in the rural area and 0.58 in the urban area. DEFF for non-EVD-specific mortality was 1.87 in the rural area and 0.44 in the urban area.
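To give a sense of what these DEFF values mean in practice, the sketch below applies three textbook relations to the reported figures: effective sample size n/DEFF, confidence-interval widening by sqrt(DEFF), and the Kish approximation DEFF = 1 + (m - 1) * rho for the implied intracluster correlation. These are assumptions for illustration, not the authors' calculations: the study estimated rates in person-time with sample weights, so treating the 6,522 individuals as the unit of analysis is only a rough approximation, and the function names are hypothetical.

```python
import math


def effective_sample_size(n, deff):
    """Sample size of a simple random sample with equivalent precision."""
    return n / deff


def ci_inflation(deff):
    """Factor by which a confidence interval widens relative to simple random sampling."""
    return math.sqrt(deff)


def implied_icc(deff, mean_cluster_size):
    """Intracluster correlation under the Kish approximation
    DEFF = 1 + (m - 1) * rho (assumes roughly equal cluster sizes)."""
    return (deff - 1) / (mean_cluster_size - 1)


# Figures taken from the abstract.
n_individuals = 6522
n_clusters = 70                 # 35 clusters in each of the two areas
mean_cluster_size = n_individuals / n_clusters

for label, deff in [("EVD", 5.53), ("non-EVD", 1.53)]:
    print(label,
          "effective n:", round(effective_sample_size(n_individuals, deff)),
          "CI inflation:", round(ci_inflation(deff), 2),
          "implied ICC:", round(implied_icc(deff, mean_cluster_size), 4))
```

Under these assumptions, a DEFF above 5 means the survey carries the information of a much smaller simple random sample for EVD mortality, which is consistent with the wide interval reported for the EVD-specific rate. A DEFF below 1, as reported for the urban area, would correspond to a negative rho in the Kish formula and should be read with caution.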

Conclusion: Our findings demonstrate a high degree of clustering; this contributed to imprecise mortality estimates, which have limited utility when assessing the impact of disease. We provide DEFF estimates that can inform future cluster surveys and discuss design improvements to mitigate the limitations of surveys for highly clustered data.

Keywords: Ebola virus disease; cluster surveys; design effects; highly-clustered data; mortality.

Plain language summary

Main findings: For humanitarian organizations, it is imperative to document the methodological limitations of cluster surveys and discuss their utility.

Added knowledge: This paper adds new knowledge on cluster surveys for highly clustered data, such as in Ebola virus disease.

Global health impact for policy and action: We provide empirical estimates and discuss design improvements to inform future studies.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adolescent
  • Adult
  • Child
  • Child, Preschool
  • Cluster Analysis
  • Disease Outbreaks*
  • Female
  • Hemorrhagic Fever, Ebola* / epidemiology
  • Hemorrhagic Fever, Ebola* / mortality
  • Humans
  • Infant
  • Male
  • Middle Aged
  • Retrospective Studies
  • Rural Population / statistics & numerical data
  • Sierra Leone / epidemiology
  • Surveys and Questionnaires
  • Urban Population
  • Young Adult

Grants and funding

Médecins sans Frontières (MSF) provided funding for this study. HAW was funded by the UK Medical Research Council (MRC) and the UK Department for International Development (DFID) under the MRC/DFID Concordat agreement, which is part of the EDCTP2 programme supported by the European Union. Grant Ref: MR/R010161/1