Using predictive analytics to improve pragmatic trial design

Clin Trials. 2020 Aug;17(4):394-401. doi: 10.1177/1740774520910367. Epub 2020 Mar 10.


Clinical trials embedded in health systems can randomize large populations using automated data sources to determine trial eligibility and assess outcomes. The suicide prevention outreach trial used real-world data for trial design and randomized 18,868 individuals in four health systems using patient-reported thoughts of death or self-harm (Patient Health Questionnaire item 9). Randomizing this sample took 3.5 years. We consider whether using predictive analytics, that is, suicide risk estimates based on prediction models, could improve trial "efficiency."

We used data on mental health outpatient visits between 1 January 2009 and 30 September 2017 in seven health systems (HealthPartners; Henry Ford Health System; and Colorado, Hawaii, Northwest, Southern California, and Washington Kaiser Permanente regions). We used a suicide risk prediction model developed in these same systems. We compared five trial designs with different eligibility criteria: a Patient Health Questionnaire item 9 response of 2 or 3, a response of 3, or a suicide risk score above the 90th, 95th, or 99th percentile. For each eligibility criterion, we compared the sample that met the criterion, the 90-day suicide attempt rate following the first eligible visit, and the sample sizes needed to detect a 15%, 25%, and 35% relative reduction in the suicide attempt rate, assuming 90% power.

Our sample included 24,355,599 outpatient visits. Despite widespread use of the Patient Health Questionnaire, 21,026,985 (86.3%) visits did not have a recorded Patient Health Questionnaire. Of the 2,928,927 individuals in our sample, 109,861 had a recorded Patient Health Questionnaire item 9 response of 2 or 3 over the study years (90-day suicide attempt rate: 1.40%), and 50,047 had a response of 3 (suicide attempt rate: 1.98%). More patients met criteria requiring a certain risk score or higher: 331,273 had a 90th percentile risk score or higher (suicide attempt rate: 1.36%), 182,316 a 95th percentile or higher (suicide attempt rate: 2.16%), and 78,655 a 99th percentile or higher (suicide attempt rate: 3.95%). An eligibility criterion of a Patient Health Questionnaire item 9 response of 2 or 3 would require randomizing 44,081 individuals (40.2% of the eligible population in our sample); a criterion of a response of 3 would require 31,024 individuals (62.0% of the eligible population). An eligibility criterion of a suicide risk score at the 90th percentile or higher would require 45,675 individuals (13.8% of the eligible population), the 95th percentile 28,699 individuals (15.7% of the eligible population), and the 99th percentile 15,509 individuals (19.7% of the eligible population).

A suicide risk prediction calculator could improve trial "efficiency" by identifying more individuals at increased suicide risk than relying on patient report. It is an open scientific question whether individuals identified using predictive analytics would respond differently to interventions than those identified by more traditional means.
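
The relationship between the baseline attempt rate and the required sample size can be illustrated with a minimal sketch. It uses the textbook normal-approximation formula for comparing two proportions and assumes a two-sided alpha of 0.05 and two equally sized arms; the abstract states neither, nor which test the authors used, so the output is not expected to reproduce the reported sample sizes exactly. The function names are hypothetical and chosen for illustration only.

```python
import math
from statistics import NormalDist


def total_sample_size(control_rate, relative_reduction, power=0.90, alpha=0.05):
    """Total N across two equal arms to detect a relative reduction in an event rate.

    Unpooled normal-approximation formula for two proportions.
    ASSUMPTIONS (not stated in the abstract): two-sided alpha = 0.05,
    1:1 allocation, no continuity correction, no adjustment for attrition.
    """
    p1 = control_rate
    p2 = control_rate * (1.0 - relative_reduction)
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_beta = NormalDist().inv_cdf(power)
    variance_sum = p1 * (1.0 - p1) + p2 * (1.0 - p2)
    n_per_arm = (z_alpha + z_beta) ** 2 * variance_sum / (p1 - p2) ** 2
    return 2 * math.ceil(n_per_arm)


def fraction_of_eligible(required, eligible):
    """Share of the eligible population that would need to be randomized."""
    return required / eligible


# 90-day suicide attempt rates reported in the abstract for each criterion.
attempt_rates = {
    "PHQ item 9 response of 2 or 3": 0.0140,
    "PHQ item 9 response of 3": 0.0198,
    "risk score >= 90th percentile": 0.0136,
    "risk score >= 95th percentile": 0.0216,
    "risk score >= 99th percentile": 0.0395,
}

for criterion, rate in attempt_rates.items():
    sizes = [total_sample_size(rate, rr) for rr in (0.15, 0.25, 0.35)]
    print(f"{criterion}: N for 15%/25%/35% reduction = {sizes}")
```

The sketch shows the pattern behind the reported results: a higher baseline attempt rate lowers the number of individuals who must be randomized, while the size of the eligible pool determines what share of it that number represents (for example, `fraction_of_eligible(31024, 50047)` is about 0.62, matching the 62.0% reported for a response of 3).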

Keywords: Study design; embedded trials; mental health; power calculations; pragmatic trials; predictive analytics; randomized trial design; sample size calculations; suicide prevention.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Adolescent
  • Adult
  • Aged
  • Electronic Health Records
  • Eligibility Determination / statistics & numerical data
  • Female
  • Humans
  • Male
  • Mental Health / statistics & numerical data
  • Middle Aged
  • Pragmatic Clinical Trials as Topic / methods*
  • Randomized Controlled Trials as Topic / methods
  • Research Design*
  • Risk Assessment / methods*
  • Risk Factors
  • Sample Size
  • Suicidal Ideation
  • Suicide / prevention & control*
  • Suicide / statistics & numerical data
  • Suicide, Attempted / prevention & control
  • Suicide, Attempted / statistics & numerical data
  • Surveys and Questionnaires
  • Young Adult