Correlating eligibility criteria generalizability and adverse events using Big Data for patients and clinical trials

Ann N Y Acad Sci. 2017 Jan;1387(1):34-43. doi: 10.1111/nyas.13195. Epub 2016 Sep 6.


Randomized controlled trials can benefit from a proactive assessment, made while eligibility criteria are being designed, of how their participant selection strategies influence study generalizability. In this paper, we present a quantitative metric called generalizability index for study traits 2.0 (GIST 2.0) to assess the a priori generalizability (based on population representativeness) of a clinical trial by accounting for the dependencies among multiple eligibility criteria. The metric was evaluated on 16 sepsis trials identified from, with their adverse event reports extracted from the trial results sections. The correlation between GIST 2.0 scores and adverse events was then analyzed. We found that the GIST 2.0 score was significantly correlated with total adverse events and serious adverse events (weighted correlation coefficients of 0.825 and 0.709, respectively, with P < 0.01). This study exemplifies the promising use of Big Data in electronic health records and for optimizing eligibility criteria design for clinical studies.
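
The abstract reports weighted correlation coefficients but does not state the weighting scheme. As a minimal sketch, one common form is a weighted Pearson correlation in which each trial contributes in proportion to a weight such as its enrollment. That choice of weight, the function name `weighted_pearson`, and the toy values below are assumptions for illustration only; they are not the authors' method or data.

```python
import math


def weighted_pearson(x, y, w):
    """Weighted Pearson correlation of x and y with non-negative weights w.

    Assumption: this is a generic textbook formula, not necessarily the
    weighting used in the study.
    """
    if not (len(x) == len(y) == len(w)):
        raise ValueError("x, y, and w must have the same length")
    total = sum(w)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    # Weighted means of each variable.
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / total
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / total

    # Weighted covariance and variances around those means.
    cov = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y)) / total
    var_x = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x)) / total
    var_y = sum(wi * (yi - mean_y) ** 2 for wi, yi in zip(w, y)) / total
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable has zero variance")

    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical toy values, NOT data from the study.
    gist_scores = [0.2, 0.4, 0.6, 0.8]
    adverse_event_rates = [0.1, 0.3, 0.4, 0.7]
    enrollment = [50, 120, 80, 200]  # assumed weights: trial enrollment
    print(round(weighted_pearson(gist_scores, adverse_event_rates, enrollment), 3))
```

With equal weights the function reduces to the ordinary Pearson coefficient, so it can be checked against any standard implementation.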

Keywords: adverse events; clinical trials; eligibility criteria; generalizability; population representativeness; trait dependencies.

MeSH terms

  • Adult
  • Anti-Infective Agents / adverse effects*
  • Anti-Infective Agents / therapeutic use
  • Clinical Trials as Topic
  • Computational Biology
  • Data Mining
  • Electronic Health Records
  • Humans
  • Patient Selection*
  • Sepsis / drug therapy*
  • Sepsis / immunology
  • Sepsis / microbiology
  • Sepsis / physiopathology
  • Software
  • Systemic Inflammatory Response Syndrome / etiology
  • Systemic Inflammatory Response Syndrome / prevention & control*
  • Translational Research, Biomedical / methods*


Substances

  • Anti-Infective Agents