Sample selection and validity of exposure-disease association estimates in cohort studies

J Epidemiol Community Health. 2011 May;65(5):407-11. doi: 10.1136/jech.2009.107185. Epub 2010 Sep 29.


Background: Participants in cohort studies are frequently selected from restricted source populations. It has been recognised that such restriction may affect study validity.

Objectives: To assess the bias that may arise when analyses involve data from cohorts based on restricted source populations, an area little studied in quantitative terms.

Methods: Monte Carlo simulations were used, based on a setting where the exposure and one unmeasured risk factor for the outcome, which are not associated in the general population, both influence selection into the cohort. All the parameters involved in the simulations (ie, prevalence and effects of exposure and risk factor on both the selection and outcome process, selection prevalence, baseline outcome incidence rate, and sample size) were allowed to vary to reflect real life settings.
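
The setting described in the Methods can be illustrated with a minimal simulation sketch. This is not the authors' code. The function name, the default parameter values (prevalences of 0.5, baseline selection odds of 0.25, baseline rate of 0.01 per year, 10 years of follow-up) and the estimator are all assumptions chosen for illustration. In particular, the sketch sets the true exposure effect to null and uses a crude log incidence-rate ratio as a stand-in for the log HR, rather than fitting a Cox model, so that any departure from zero reflects bias induced by selection.

```python
import math
import random


def simulate_selection_bias(
    n=200_000,                    # size of the general population (assumed)
    p_exposure=0.5,               # exposure prevalence (assumed)
    p_risk_factor=0.5,            # risk factor prevalence (assumed)
    or_sel_exposure=4.0,          # OR for exposure -> selection
    or_sel_risk_factor=4.0,       # OR for risk factor -> selection
    hr_risk_factor=4.0,           # HR for risk factor -> disease
    baseline_selection_odds=0.25, # selection odds when both are absent (assumed)
    baseline_rate=0.01,           # baseline disease rate per year (assumed)
    follow_up=10.0,               # administrative censoring time in years (assumed)
    seed=2011,
):
    """Return the crude log rate ratio for exposure in the selected cohort.

    The true exposure effect is null, so the returned value estimates the
    bias caused by selecting on exposure and an unmeasured risk factor.
    """
    rng = random.Random(seed)
    events = [0, 0]            # index 0 = unexposed, 1 = exposed
    person_time = [0.0, 0.0]

    for _ in range(n):
        # Exposure and risk factor are independent in the general population.
        exposed = rng.random() < p_exposure
        risk = rng.random() < p_risk_factor

        # Logistic selection model without interaction on the odds scale.
        odds = baseline_selection_odds
        if exposed:
            odds *= or_sel_exposure
        if risk:
            odds *= or_sel_risk_factor
        if rng.random() >= odds / (1.0 + odds):
            continue  # not selected into the cohort

        # Exponential time to disease; only the risk factor changes the rate.
        rate = baseline_rate * (hr_risk_factor if risk else 1.0)
        time_to_event = rng.expovariate(rate)

        group = int(exposed)
        person_time[group] += min(time_to_event, follow_up)
        if time_to_event <= follow_up:
            events[group] += 1

    rate_exposed = events[1] / person_time[1]
    rate_unexposed = events[0] / person_time[0]
    return math.log(rate_exposed / rate_unexposed)


if __name__ == "__main__":
    print(f"estimated bias in log rate ratio: {simulate_selection_bias():.3f}")
```

With both selection ORs above 1, selection induces a negative association between exposure and the risk factor within the cohort, so this sketch should return a negative value. Setting both selection ORs to 1 removes the induced association, and the estimate should then scatter around zero.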

Results: The simulations show that when the exposure and risk factor are strongly associated with selection (ORs of 4 or 0.25) and the unmeasured risk factor is associated with a disease HR of 4, the bias in the estimated log HR for the exposure-disease association is ±0.15. When these associations decrease to values more commonly seen in epidemiological studies (eg, ORs and HRs of 2 or 0.5), the bias in the log HR drops to just ±0.02.
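
To read these figures on the ratio scale, a bias of b in the log HR multiplies the estimated HR by exp(b). The short sketch below performs that conversion for the values quoted in the Results; the helper name is hypothetical. A bias of ±0.15 corresponds to an HR roughly 16% too high or 14% too low, and a bias of ±0.02 to an HR about 2% off in either direction.

```python
import math


def hr_multiplier(log_hr_bias):
    """Multiplicative distortion of the HR implied by a bias on the log scale."""
    return math.exp(log_hr_bias)


# Bias values quoted in the Results, in both directions.
for bias in (0.15, -0.15, 0.02, -0.02):
    print(f"log HR bias {bias:+.2f} -> HR multiplied by {hr_multiplier(bias):.3f}")
```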

Conclusions: Using a restricted source population for a cohort study will, under a range of sensible scenarios, produce only relatively weak bias in estimates of the exposure-disease associations.

Publication types

  • Research Support, Non-U.S. Gov't
  • Validation Study

MeSH terms

  • Cohort Studies
  • Environmental Exposure / adverse effects*
  • Environmental Exposure / statistics & numerical data
  • Epidemiologic Methods*
  • Humans
  • Incidence
  • Logistic Models
  • Monte Carlo Method
  • Prevalence
  • Risk Assessment / methods
  • Risk Factors
  • Sample Size
  • Selection Bias*