Transparency and Reproducibility of Observational Cohort Studies Using Large Healthcare Databases

Clin Pharmacol Ther. 2016 Mar;99(3):325-32. doi: 10.1002/cpt.329.


The scientific community and decision-makers are increasingly concerned about transparency and reproducibility of epidemiologic studies using longitudinal healthcare databases. We explored the extent to which published pharmacoepidemiologic studies using commercially available databases could be reproduced by other investigators. We identified a nonsystematic sample of 38 descriptive or comparative safety/effectiveness cohort studies. Seven studies were excluded from reproduction, five because of violation of fundamental design principles, and two because of grossly inadequate reporting. In the remaining studies, >1,000 patient characteristics and measures of association were reproduced with a high degree of accuracy (median differences between original and reproduction of <2% and <0.1, respectively). An essential component of transparent and reproducible research with healthcare databases is more complete reporting of study implementation. Once reproducibility is achieved, the conversation can be elevated to assess whether suboptimal design choices led to avoidable bias and whether findings are replicable in other data sources.

Publication types

  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Access to Information*
  • Cohort Studies
  • Databases, Factual*
  • Humans
  • Observational Studies as Topic / standards*
  • Pharmacoepidemiology / standards*
  • Reproducibility of Results