Survey inference for subpopulations

Am J Epidemiol. 1996 Jul 1;144(1):102-6. doi: 10.1093/oxfordjournals.aje.a008847.


Analysts frequently examine only a subset of the data collected in a survey when interest focuses on individuals in a particular subpopulation of the sampled population. Although it may seem natural to delete all records for individuals outside the subpopulation before analysis, doing so may yield incorrect standard errors and confidence intervals. The authors illustrate this problem with two examples, using data from the 1987 National Health Interview Survey and the 1986 National Mortality Followback Survey. They describe the correct method of analysis and give a simple condition that, when satisfied, ensures that the elimination approach yields the same answers as the correct method.
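
The contrast between the two approaches can be sketched on toy data. The abstract does not give the estimator, so this sketch assumes a stratified cluster design with a with-replacement, Taylor-linearized variance estimator for a weighted subpopulation mean. The function name `domain_mean_se`, the tuple layout, and the data values are hypothetical and do not come from the paper. With `drop_outside=False`, every sampled primary sampling unit (PSU) stays in the variance calculation and records outside the subpopulation contribute zero. With `drop_outside=True`, those records are deleted first, so a PSU with no subpopulation members disappears from the design.

```python
from collections import defaultdict


def domain_mean_se(rows, drop_outside=False):
    """Weighted subpopulation mean and its linearized standard error.

    rows: iterable of (stratum, psu, weight, y, in_domain) tuples.
    drop_outside=True mimics the elimination approach: records outside
    the subpopulation are deleted before anything else is computed.
    Assumption: with-replacement sampling of PSUs within strata.
    """
    rows = list(rows)
    if drop_outside:
        rows = [r for r in rows if r[4]]

    weight_sum = sum(w for _, _, w, _, d in rows if d)
    mean = sum(w * y for _, _, w, y, d in rows if d) / weight_sum

    # Linearized contribution of each record, totalled within PSU.
    # Records outside the subpopulation add 0.0 but keep their PSU present.
    psu_totals = defaultdict(lambda: defaultdict(float))
    for h, i, w, y, d in rows:
        psu_totals[h][i] += w * (y - mean) / weight_sum if d else 0.0

    variance = 0.0
    for totals in psu_totals.values():
        n = len(totals)  # number of PSUs seen in this stratum
        if n < 2:
            continue  # illustrative choice: single-PSU strata add nothing
        zbar = sum(totals.values()) / n
        variance += n / (n - 1) * sum((z - zbar) ** 2 for z in totals.values())
    return mean, variance ** 0.5


# Hypothetical data: one stratum, three PSUs; PSU 3 has no subpopulation members.
rows = [
    ("A", 1, 1.0, 10.0, True),
    ("A", 1, 1.0, 12.0, True),
    ("A", 2, 1.0, 20.0, True),
    ("A", 2, 1.0, 5.0, False),
    ("A", 3, 1.0, 7.0, False),
    ("A", 3, 1.0, 9.0, False),
]

full = domain_mean_se(rows)                    # all PSUs retained
sub = domain_mean_se(rows, drop_outside=True)  # outside records deleted
print(full, sub)
```

Both calls return the same point estimate (14.0), but the standard errors differ (about 3.46 with all PSUs retained, 4.00 after deletion) because deletion changes the number of PSUs the estimator sees. In this sketch the two agree whenever every sampled PSU contains at least one subpopulation member; the abstract does not state whether that is the authors' condition.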

MeSH terms

  • Adult
  • Aged
  • Aged, 80 and over
  • Analysis of Variance
  • Bias
  • Confidence Intervals
  • Data Interpretation, Statistical*
  • Digestive System Neoplasms / mortality
  • Female
  • Health Surveys*
  • Humans
  • Male
  • Middle Aged
  • Regression Analysis
  • Reproducibility of Results
  • Research Design
  • Sampling Studies*
  • United States / epidemiology