The design versus the analysis of observational studies for causal effects: parallels with the design of randomized trials

Stat Med. 2007 Jan 15;26(1):20-36. doi: 10.1002/sim.2739.


For estimating causal effects of treatments, randomized experiments are generally considered the gold standard. Nevertheless, they are often infeasible to conduct for a variety of reasons, such as ethical concerns, excessive expense, or time constraints. Consequently, much of our knowledge of causal effects must come from non-randomized observational studies. This article advocates the position that observational studies can and should be designed to approximate randomized experiments as closely as possible. In particular, observational studies should be designed using only background information to create subgroups of similar treated and control units, where 'similar' here refers to their distributions of background variables. Of great importance, this activity should be conducted without access to any outcome data, thereby ensuring the objectivity of the design. In many situations, this objective creation of subgroups of similar treated and control units, which are balanced with respect to covariates, can be accomplished using propensity score methods. The theoretical perspective underlying this position is presented, followed by a particular application in the context of the US tobacco litigation. This application uses propensity score methods to create subgroups of treated units (male current smokers) and control units (male never smokers) who are at least as similar with respect to their distributions of observed background characteristics as if they had been randomized. The collection of these subgroups then 'approximates' a randomized block experiment with respect to the observed covariates.
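
The design step described above can be illustrated with a minimal sketch. It is not the article's analysis: the data are synthetic, the two covariates and the true assignment model are invented for illustration, the choice of five subclasses is an assumption, and the helper names (`fit_logistic`, `propensity`, `subclassify`, `mean_diff`) are hypothetical. The sketch estimates a propensity score from covariates alone, groups units into subclasses by that score, and compares treated and control covariate means within each subclass. No outcome variable appears anywhere in the code, which mirrors the requirement that the design be completed without access to outcome data.

```python
import math
import random


def fit_logistic(X, t, steps=500, lr=0.5):
    """Fit P(T=1 | X) by plain gradient ascent on the mean log-likelihood."""
    n, p = len(X), len(X[0])
    w = [0.0] * (p + 1)  # intercept followed by one weight per covariate
    for _ in range(steps):
        grad = [0.0] * (p + 1)
        for xi, ti in zip(X, t):
            z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
            err = ti - 1.0 / (1.0 + math.exp(-z))
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj + lr * gj / n for wj, gj in zip(w, grad)]
    return w


def propensity(w, xi):
    """Estimated probability of treatment for one unit's covariates."""
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
    return 1.0 / (1.0 + math.exp(-z))


def subclassify(scores, k=5):
    """Assign each unit to one of k subclasses by the rank of its score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    labels = [0] * len(scores)
    for rank, i in enumerate(order):
        labels[i] = rank * k // len(scores)
    return labels


def mean_diff(x, t, idx):
    """Treated-minus-control mean of covariate x over the units in idx."""
    treated = [x[i] for i in idx if t[i] == 1]
    control = [x[i] for i in idx if t[i] == 0]
    if not treated or not control:
        return None  # no comparison possible in this subclass
    return sum(treated) / len(treated) - sum(control) / len(control)


# Synthetic background covariates and a treatment indicator (assumed model).
random.seed(0)
n = 400
X = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(n)]
t = [
    1 if random.random() < 1 / (1 + math.exp(-(0.8 * x[0] - 0.5 * x[1]))) else 0
    for x in X
]

# Design phase: uses covariates and treatment assignment only.
w = fit_logistic(X, t)
scores = [propensity(w, xi) for xi in X]
labels = subclassify(scores, k=5)

# Balance check on the first covariate, overall and within each subclass.
x0 = [xi[0] for xi in X]
print("overall difference:", mean_diff(x0, t, range(n)))
for s in range(5):
    idx = [i for i in range(n) if labels[i] == s]
    print("subclass", s, "size", len(idx), "difference:", mean_diff(x0, t, idx))
```

If the subclasses are doing their job, the within-subclass differences should generally be smaller in magnitude than the overall difference. In practice the propensity model would be revised and the balance check repeated until the covariates are acceptably balanced, still without consulting any outcome.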

Publication types

  • Historical Article

MeSH terms

  • Biometry / history
  • Causality*
  • Data Interpretation, Statistical
  • History, 20th Century
  • History, 21st Century
  • Humans
  • Jurisprudence
  • Male
  • Models, Statistical*
  • Randomized Controlled Trials as Topic / history
  • Randomized Controlled Trials as Topic / statistics & numerical data*
  • Smoking / adverse effects
  • United States