Differential recall bias and spurious associations in case/control studies

Stat Med. 1996 Dec 15;15(23):2603-16. doi: 10.1002/(SICI)1097-0258(19961215)15:23<2603::AID-SIM371>3.0.CO;2-G.


Consider a case/control study designed to investigate a possible association between exposure to a putative risk factor and development of a particular disease. Let E denote the information required to specify a subject's exposure to the risk factor. We examine the effect that errors in the recorded values of E (which we denote by E*) have on inferences of an association between disease and the risk factor. We concentrate on situations where the errors in recorded exposure are such that exposure is underestimated for controls and overestimated for cases. This phenomenon is referred to as differential recall bias and may lead to spurious inferences of an association between exposure and disease. We describe how the standard inferential techniques used in the analysis of data from case/control studies may be adjusted to take account of specified mechanisms whereby E is distorted to produce E*. Such adjustments may be used to determine the sensitivity of an analysis to the phenomenon of differential recall bias and to quantify the extent of such bias that would be required to overturn the conclusions of the analysis. There remains the matter of judging whether a given distortion mechanism is reasonable in a particular context. This emphasizes the need for investigators to take account of differential recall bias in validation studies of exposure assessment techniques. The methodology developed here is applied to a recent major study investigating the possible association between lung cancer and exposure to environmental tobacco smoke. The log-odds ratio of 0.23 based on recorded exposure differs significantly from 0 (P < 0.02). However, the association is rendered non-significant by a very modest degree of differential recall bias. For example, if 3.8 per cent of exposed controls report no exposure, 3.8 per cent of unexposed cases report exposure, and all other subjects report exposure accurately, the log-odds ratio drops to 0.07 and the corresponding p-value increases to 0.49.
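
The distortion mechanism in the example above can be illustrated with a small sketch. It is not the likelihood-based adjustment developed in the paper; it only back-corrects a 2x2 table of counts under the stated mechanism, in which a fraction of exposed controls report no exposure and a fraction of unexposed cases report exposure. The counts are hypothetical and are not the data of the lung cancer study, and the function names `log_odds_ratio` and `undo_recall_bias` are invented for illustration. No standard error or p-value is computed.

```python
import math


def log_odds_ratio(case_exp, case_unexp, ctrl_exp, ctrl_unexp):
    """Log-odds ratio of exposure, cases versus controls, from a 2x2 table."""
    return math.log((case_exp * ctrl_unexp) / (case_unexp * ctrl_exp))


def undo_recall_bias(case_exp, case_unexp, ctrl_exp, ctrl_unexp,
                     under_rate, over_rate):
    """Estimate true counts from recorded counts under an assumed mechanism.

    under_rate: fraction of truly exposed controls who report no exposure.
    over_rate:  fraction of truly unexposed cases who report exposure.
    All other subjects are assumed to report accurately.
    """
    if not (0.0 <= under_rate < 1.0 and 0.0 <= over_rate < 1.0):
        raise ValueError("rates must lie in [0, 1)")

    # Controls: recorded exposed = true exposed * (1 - under_rate).
    true_ctrl_exp = ctrl_exp / (1.0 - under_rate)
    true_ctrl_unexp = (ctrl_exp + ctrl_unexp) - true_ctrl_exp

    # Cases: recorded unexposed = true unexposed * (1 - over_rate).
    true_case_unexp = case_unexp / (1.0 - over_rate)
    true_case_exp = (case_exp + case_unexp) - true_case_unexp

    return true_case_exp, true_case_unexp, true_ctrl_exp, true_ctrl_unexp


# Hypothetical recorded counts (NOT taken from the study in the abstract).
recorded = (400, 250, 700, 550)  # case_exp, case_unexp, ctrl_exp, ctrl_unexp

observed = log_odds_ratio(*recorded)
corrected = log_odds_ratio(*undo_recall_bias(*recorded,
                                             under_rate=0.038,
                                             over_rate=0.038))

print(f"log-odds ratio from recorded exposure: {observed:.3f}")
print(f"log-odds ratio after back-correction:  {corrected:.3f}")
```

Varying `under_rate` and `over_rate` gives a rough sensitivity check of the kind the abstract describes: the corrected log-odds ratio shrinks toward 0 as the assumed degree of differential recall bias grows.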

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adult
  • Aged
  • Bias
  • Case-Control Studies*
  • Child
  • Cotinine / blood
  • Cross-Sectional Studies
  • Environmental Exposure / adverse effects
  • Environmental Monitoring
  • Epidemiological Monitoring
  • Female
  • Humans
  • Likelihood Functions
  • Logistic Models*
  • Lung Neoplasms / epidemiology
  • Lung Neoplasms / etiology
  • Middle Aged
  • Odds Ratio
  • Population Surveillance
  • Pregnancy
  • Prenatal Exposure Delayed Effects
  • Research Design
  • Risk Assessment*
  • Risk Factors
  • Tobacco Smoke Pollution / adverse effects


Substances

  • Tobacco Smoke Pollution
  • Cotinine