Rapid report on estimating incidence from cross-sectional data

Ann Epidemiol. 2021 Jan;53:106-108.e1. doi: 10.1016/j.annepidem.2020.06.005. Epub 2020 Oct 20.


Purpose: In prospective cohort studies, incidence is typically estimated by the ratio of the observed number of events to person-time at risk. This crude estimator is consistent for the true population incidence rate (IR) under mild assumptions. Here we consider a different setting where only cross-sectional data are available, that is, at a single time point, participants are evaluated to identify whether they have previously had the event of interest.

Methods: Unlike in the prospective cohort setting, the crude IR estimator is biased when only cross-sectional data are available. The maximum likelihood estimator (MLE) may be used instead. Although the MLE does not have a simple closed form, it is consistent and easy to compute using statistical software. To compare the bias of the MLE with that of the crude estimator, a simulation was conducted.
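
As an illustration of the computation described in Methods, the following Python sketch shows one way such an MLE could be obtained. It rests on assumptions not spelled out in the abstract: each participant i contributes an observation time t_i (time at risk up to the cross-sectional evaluation) and an indicator d_i of whether the event occurred before t_i, and the hazard is a constant rate lambda, so the likelihood contribution is (1 - exp(-lambda t_i))^d_i * exp(-lambda t_i)^(1 - d_i). The crude estimator is assumed to be the number of events divided by total observation time. The function names and the bisection solver are illustrative choices, not the authors' implementation.

```python
import math


def score(lam, times, events):
    """Derivative of the assumed log-likelihood with respect to the rate lam."""
    total = 0.0
    for t, d in zip(times, events):
        if d:
            x = lam * t
            # t * exp(-x) / (1 - exp(-x)) == t / expm1(x); negligible for huge x
            if x < 700.0:
                total += t / math.expm1(x)
        else:
            total -= t
    return total


def mle_incidence_rate(times, events, tol=1e-10):
    """Solve score(lam) = 0 by bisection (the score is decreasing in lam)."""
    if not any(events) or all(events):
        raise ValueError("need at least one event and one non-event")
    lo, hi = 1e-12, 1.0
    while score(hi, times, events) > 0:  # expand until the root is bracketed
        hi *= 2.0
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if score(mid, times, events) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def crude_incidence_rate(times, events):
    """Events divided by total observation time (assumed form of the crude IR)."""
    return sum(events) / sum(times)
```

Under these assumptions, when every participant has the same observation time t, the MLE reduces to -log(1 - p)/t, where p is the observed proportion with the event, which gives a quick check of the solver.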

Results: The crude estimator underestimated the true incidence, whereas the MLE was approximately unbiased. In general, the bias of the crude estimator tended to be roughly one to two orders of magnitude larger (in absolute value) than that of the MLE.
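
A simulation of the kind summarized in Results could be sketched as follows. All settings here are hypothetical and not taken from the paper: a true constant rate of 0.05, observation times drawn uniformly from 1 to 30, 300 participants per replicate, and 100 replicates. The likelihood (constant hazard, with only the event indicator and observation time known for each participant) and the form of the crude estimator (events divided by total observation time) are likewise assumptions.

```python
import math
import random


def mle_rate(times, events, tol=1e-10):
    """MLE of a constant rate from event indicators and observation times."""
    if not any(events) or all(events):
        raise ValueError("need at least one event and one non-event")

    def score(lam):
        # Derivative of the assumed log-likelihood; decreasing in lam.
        total = 0.0
        for t, d in zip(times, events):
            if d:
                x = lam * t
                if x < 700.0:  # avoid overflow; the term is negligible beyond this
                    total += t / math.expm1(x)
            else:
                total -= t
        return total

    lo, hi = 1e-12, 1.0
    while score(hi) > 0:  # expand until the root is bracketed
        hi *= 2.0
    while hi - lo > tol * hi:  # bisection
        mid = 0.5 * (lo + hi)
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def simulate_bias(true_rate=0.05, n=300, reps=100, seed=1):
    """Return (crude bias, MLE bias) averaged over simulated data sets."""
    rng = random.Random(seed)
    crude_sum = 0.0
    mle_sum = 0.0
    for _ in range(reps):
        times = [rng.uniform(1.0, 30.0) for _ in range(n)]
        # Only whether the event happened before the evaluation is recorded.
        events = [rng.expovariate(true_rate) <= t for t in times]
        crude_sum += sum(events) / sum(times)
        mle_sum += mle_rate(times, events)
    return crude_sum / reps - true_rate, mle_sum / reps - true_rate


if __name__ == "__main__":
    crude_bias, mle_bias = simulate_bias()
    print(f"crude bias: {crude_bias:.5f}, MLE bias: {mle_bias:.5f}")
```

In this sketch the crude estimator falls below the true rate because participants who have had the event still contribute their full observation time to the denominator; the size of the gap depends on the hypothetical settings chosen.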

Conclusions: For cross-sectional data in which exact event times are unknown, the MLE of the IR is straightforward to calculate, more accurate than the crude IR estimator, and consistent provided the hazard is constant.

Keywords: Bias; Censored data; Cross-sectional; Estimator; Incidence rate; Maximum likelihood.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Bias
  • Cross-Sectional Studies*
  • Humans
  • Incidence*
  • Likelihood Functions
  • Research Design*