Generalizing Treatment Effect Estimates From Sample to Population: A Case Study in the Difficulties of Finding Sufficient Data

Eval Rev. 2017 Aug;41(4):357-388. doi: 10.1177/0193841X16660663. Epub 2016 Aug 4.


Background: Given increasing concerns about the relevance of research to policy and practice, there is growing interest in assessing and enhancing the external validity of randomized trials: determining how useful a given randomized trial is for informing a policy question for a specific target population.

Objectives: This article highlights recent advances in assessing and enhancing external validity, with a focus on the data needed to make ex post statistical adjustments to enhance the applicability of experimental findings to populations potentially different from their study sample.

Research design: We use a case study to illustrate how to generalize treatment effect estimates from a randomized trial sample to a target population. Specifically, we compare the sample of children in a randomized trial of a supplemental program for Head Start centers (the Research-Based, Developmentally Informed [REDI] study) with the national population of children eligible for Head Start, as represented in the Head Start Impact Study.
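
Illustration: The kind of ex post adjustment described above can be sketched with inverse-odds-of-participation weighting, one common propensity-score-type approach. The abstract does not specify the authors' exact estimator, so this is a minimal sketch under stated assumptions, not their method. All function names (`fit_logistic`, `inverse_odds_weights`, `weighted_effect`, `simulate`) are hypothetical, and the data are simulated with a single covariate; they are not REDI or Head Start Impact Study data.

```python
import numpy as np


def fit_logistic(X, y, n_iter=50, ridge=1e-8):
    """Fit a logistic regression with intercept by Newton-Raphson (illustrative helper)."""
    Xd = np.column_stack([np.ones(len(X)), X])
    beta = np.zeros(Xd.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(Xd @ beta)))
        grad = Xd.T @ (y - p)
        # Tiny ridge term only stabilizes the linear solve; it does not move the optimum.
        hess = (Xd * (p * (1.0 - p))[:, None]).T @ Xd + ridge * np.eye(Xd.shape[1])
        step = np.linalg.solve(hess, grad)
        beta += step
        if np.max(np.abs(step)) < 1e-10:
            break
    return beta


def inverse_odds_weights(X_trial, X_pop):
    """Weight trial units by (1 - p) / p, where p = P(in trial | covariates).

    The model is fit on the stacked trial and population covariates, so it can
    only use covariates measured comparably in both data sets.
    """
    X = np.vstack([X_trial, X_pop])
    s = np.concatenate([np.ones(len(X_trial)), np.zeros(len(X_pop))])  # 1 = trial
    beta = fit_logistic(X, s)
    Xd = np.column_stack([np.ones(len(X_trial)), X_trial])
    p = 1.0 / (1.0 + np.exp(-(Xd @ beta)))
    return (1.0 - p) / p


def weighted_effect(y, treat, w):
    """Weighted difference in mean outcomes between treated and control trial units."""
    t = treat == 1
    return np.average(y[t], weights=w[t]) - np.average(y[~t], weights=w[~t])


def simulate(seed=0, n_trial=4000, n_pop=8000):
    """Simulated example: the trial over-represents high-x children, and the
    treatment effect (1 + x) grows with x, so the sample effect overstates the
    population average effect (which is 1.0 here by construction)."""
    rng = np.random.default_rng(seed)
    x_trial = rng.normal(0.5, 1.0, size=(n_trial, 1))
    x_pop = rng.normal(0.0, 1.0, size=(n_pop, 1))
    treat = rng.integers(0, 2, size=n_trial)  # randomized within the trial
    y = x_trial[:, 0] + treat * (1.0 + x_trial[:, 0]) + rng.normal(size=n_trial)
    w = inverse_odds_weights(x_trial, x_pop)
    return {
        "sample_effect": weighted_effect(y, treat, np.ones(n_trial)),
        "generalized_effect": weighted_effect(y, treat, w),
        "trial_mean_x": float(x_trial.mean()),
        "weighted_trial_mean_x": float(np.average(x_trial[:, 0], weights=w)),
        "pop_mean_x": float(x_pop.mean()),
    }
```

Calling `simulate()` returns the unweighted and reweighted effect estimates alongside covariate means, so one can check whether the weighted trial sample resembles the population. The adjustment can only balance covariates that are recorded in both data sets.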

Results: For this case study, common data elements between the trial sample and population were limited, making reliable generalization from the trial sample to the population challenging.

Conclusions: To answer important questions about external validity, more publicly available data are needed. In addition, future studies should make an effort to collect measures similar to those in other data sets. Comparable measures across population data sets and randomized trials that use convenience samples would greatly expand the range of research- and policy-relevant questions that can be answered.

Keywords: Head Start Impact Study; REDI evaluation; causal inference; generalizability; transportability.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Child, Preschool
  • Data Collection*
  • Early Intervention, Educational
  • Female
  • Humans
  • Male
  • Policy Making
  • Program Evaluation*
  • Propensity Score
  • Randomized Controlled Trials as Topic*
  • Reproducibility of Results*