Development and evaluation of a common data model enabling active drug safety surveillance using disparate healthcare databases

J Am Med Inform Assoc. Nov-Dec 2010;17(6):652-62. doi: 10.1136/jamia.2009.002477.


Objective: Active drug safety surveillance may be enhanced by analysis of multiple observational healthcare databases, including administrative claims and electronic health records. The objective of this study was to develop and evaluate a common data model (CDM) enabling rapid, comparable, systematic analyses across disparate observational data sources to identify and evaluate the effects of medicines.

Design: The CDM uses a person-centric design, with attributes for demographics, drug exposures, and condition occurrence. Drug eras, constructed to represent periods of persistent drug use, are derived from available elements from pharmacy dispensings, prescriptions written, and other medication history. Condition eras aggregate diagnoses that occur within a single episode of care. Drugs and conditions from source data are mapped to biomedical ontologies to standardize terminologies and enable analyses of higher-order effects.
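
The drug era construction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the study's transformation rules: the 30-day persistence window, the record fields, and the names DrugExposure, DrugEra, and build_drug_eras are all hypothetical, since the abstract does not specify them. The sketch merges a person's exposures to the same standardized drug concept into one era whenever the next exposure starts within the window after the current era ends.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Iterable, List


@dataclass(frozen=True)
class DrugExposure:
    """One dispensing, prescription, or medication-history record (hypothetical shape)."""
    person_id: int
    drug_concept: str  # standardized drug concept after vocabulary mapping
    start: date
    end: date


@dataclass(frozen=True)
class DrugEra:
    """A period of persistent use of one drug concept by one person."""
    person_id: int
    drug_concept: str
    start: date
    end: date
    exposure_count: int


def build_drug_eras(
    exposures: Iterable[DrugExposure],
    persistence_window_days: int = 30,  # assumed value, not taken from the abstract
) -> List[DrugEra]:
    """Merge exposures into eras, allowing gaps up to the persistence window."""
    window = timedelta(days=persistence_window_days)
    eras: List[DrugEra] = []
    # Sorting groups records by person and drug, then orders them in time.
    ordered = sorted(
        exposures, key=lambda e: (e.person_id, e.drug_concept, e.start, e.end)
    )
    for exp in ordered:
        last = eras[-1] if eras else None
        same_key = last is not None and (
            (last.person_id, last.drug_concept) == (exp.person_id, exp.drug_concept)
        )
        if same_key and exp.start <= last.end + window:
            # Close enough to the running era: extend it.
            eras[-1] = DrugEra(
                last.person_id,
                last.drug_concept,
                last.start,
                max(last.end, exp.end),
                last.exposure_count + 1,
            )
        else:
            # Different person or drug, or gap too long: start a new era.
            eras.append(
                DrugEra(exp.person_id, exp.drug_concept, exp.start, exp.end, 1)
            )
    return eras
```

Condition eras could be built the same way by swapping the drug concept for a condition concept and choosing a window that reflects a single episode of care; that window would likewise be an assumption to tune per analysis.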

Measurements: The CDM was applied to two source types: an administrative claims database and an electronic medical record database. Descriptive statistics were used to evaluate transformation rules. Two case studies demonstrate the ability of the CDM to enable standard analyses across disparate sources: analyses of persons exposed to rofecoxib and persons with an acute myocardial infarction.

Results: Records for over 43 million persons, with nearly 1 billion drug exposures and 3.7 billion condition occurrences across both databases, were successfully transformed into the CDM. An analysis routine applied to transformed data from each database produced consistent, comparable results.

Conclusion: A CDM can normalize the structure and content of disparate observational data, enabling standardized analyses that are meaningfully comparable when assessing the effects of medicines.

MeSH terms

  • Adolescent
  • Adult
  • Aged
  • Child
  • Cyclooxygenase 2 Inhibitors / adverse effects
  • Data Mining / methods*
  • Drug Information Services*
  • Female
  • Humans
  • Information Systems*
  • Lactones / adverse effects
  • Male
  • Middle Aged
  • Models, Theoretical
  • Myocardial Infarction / chemically induced
  • Product Surveillance, Postmarketing*
  • Reproducibility of Results
  • Sulfones / adverse effects
  • Systems Integration*
  • United States
  • Vocabulary, Controlled

Substances

  • Cyclooxygenase 2 Inhibitors
  • Lactones
  • Sulfones
  • rofecoxib