Statistical rules for safety monitoring in clinical trials

Clin Trials. 2024 Apr;21(2):152-161. doi: 10.1177/17407745231203391. Epub 2023 Oct 25.

Abstract

Background/aims: Protecting patient safety is an essential component of the conduct of clinical trials. Rigorous safety monitoring schemes are implemented for these studies to guard against excess toxicity risk from study therapies. They often include protocol-specified stopping rules dictating that an excessive number of safety events will trigger a halt of the study. Statistical methods are useful for constructing rules that protect patients from exposure to excessive toxicity while also keeping the chance of a false safety signal low. Several statistical techniques have been proposed for this purpose, but the current literature lacks a rigorous comparison to determine which method is most suitable for a given trial design. The aims of this article are (1) to describe a general framework for repeated monitoring of safety events in clinical trials; (2) to survey common statistical techniques for creating safety stopping criteria; and (3) to provide investigators with a software tool for constructing and assessing these stopping rules.

Methods: The properties and operating characteristics of stopping rules produced by Pocock and O'Brien-Fleming tests, Bayesian Beta-Binomial models, and sequential probability ratio tests (SPRTs) are studied and compared for common scenarios that may arise in phase II and III trials. We developed the R package "stoppingrule" for constructing and evaluating stopping rules from these methods. Its usage is demonstrated through a redesign of a stopping rule for BMT CTN 0601 (registered at ClinicalTrials.gov as NCT00745420), a phase II, single-arm clinical trial that evaluated outcomes in pediatric sickle cell disease patients treated with bone marrow transplant.
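As an illustration of how a Bayesian Beta-Binomial stopping rule of the kind named above can be derived, the following Python sketch computes, for each number of evaluated patients, the smallest toxicity count at which the posterior probability that the toxicity rate exceeds a null rate passes a threshold. It is not the "stoppingrule" package or its interface. The function names, the Beta(1, 4) prior, the null rate of 0.2, and the 0.9 threshold are assumptions chosen for the example, and the prior parameters are restricted to integers so that the Beta tail probability can be computed from a binomial sum using only the standard library.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    if k < 0:
        return 0.0
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(min(k, n) + 1))


def posterior_prob_exceeds(events, n, p0, a, b):
    """P(rate > p0 | events out of n) under a Beta(a, b) prior.

    Assumes integer a, b >= 1, so that the identity
    P(Beta(A, B) > x) = P(Binomial(A + B - 1, x) <= A - 1) applies.
    """
    a_post = a + events
    b_post = b + n - events
    return binom_cdf(a_post - 1, a_post + b_post - 1, p0)


def beta_binomial_boundary(n_max, p0, a, b, threshold):
    """For n = 1..n_max, the smallest event count that triggers stopping.

    An entry is None when no count out of n patients can cross the threshold.
    """
    boundary = []
    for n in range(1, n_max + 1):
        crossing = None
        for x in range(n + 1):
            if posterior_prob_exceeds(x, n, p0, a, b) > threshold:
                crossing = x
                break
        boundary.append(crossing)
    return boundary


# Hypothetical design values, for illustration only.
example_boundary = beta_binomial_boundary(n_max=20, p0=0.2, a=1, b=4, threshold=0.9)
for n, x in enumerate(example_boundary, start=1):
    print(n, x)
```

Increasing a and b while holding the prior mean fixed corresponds to a stronger prior, so rerunning the sketch with, say, a = 4 and b = 16 is one way to see how prior strength shifts the early part of the boundary.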

Results: Methods with aggressive stopping criteria early in the trial, such as the Pocock test and Bayesian Beta-Binomial models with weak priors, have permissive stopping criteria at late stages. This results in a trade-off: rules with aggressive early monitoring generally have a smaller expected number of toxicities but also lower power than rules with more conservative early stopping, such as the O'Brien-Fleming test and Beta-Binomial models with strong priors. The modified SPRT method is sensitive to the choice of alternative toxicity rate. The maximized SPRT generally has a higher expected number of toxicities and/or worse power than other methods.
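To make the idea of a rule that spends the same amount of evidence at every look concrete, the sketch below builds a Pocock-type boundary in a simplified sense: the same nominal one-sided binomial significance level is applied after every patient, and that level is calibrated by bisection so that the overall probability of a false safety signal under the null toxicity rate stays at or below alpha. This is an assumption-laden approximation of the idea, not necessarily the construction used in the article or in the "stoppingrule" package. All function names and the design values (20 patients, null rate 0.1, alpha 0.05) are hypothetical.

```python
from math import comb


def binom_upper_tail(x, n, p):
    """P(X >= x) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(x, n + 1))


def boundary_at_level(n_max, p0, level):
    """Smallest event count at each n whose exact one-sided p-value is <= level."""
    boundary = []
    for n in range(1, n_max + 1):
        crossing = None
        for x in range(1, n + 1):
            if binom_upper_tail(x, n, p0) <= level:
                crossing = x
                break
        boundary.append(crossing)
    return boundary


def stop_probability(boundary, p):
    """Exact probability of ever crossing the boundary when the true rate is p."""
    alive = {0: 1.0}  # event count -> probability of being there without stopping
    stopped = 0.0
    for crossing in boundary:
        nxt = {}
        for x, pr in alive.items():
            nxt[x] = nxt.get(x, 0.0) + pr * (1 - p)
            nxt[x + 1] = nxt.get(x + 1, 0.0) + pr * p
        alive = {}
        for x, pr in nxt.items():
            if crossing is not None and x >= crossing:
                stopped += pr
            else:
                alive[x] = pr
    return stopped


def constant_level_boundary(n_max, p0, alpha, iterations=40):
    """Bisection over nominal levels in [0, alpha]; keeps the largest feasible one found."""
    lo, hi = 0.0, alpha
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if stop_probability(boundary_at_level(n_max, p0, mid), p0) <= alpha:
            lo = mid
        else:
            hi = mid
    return boundary_at_level(n_max, p0, lo)


# Hypothetical design values, for illustration only.
rule = constant_level_boundary(n_max=20, p0=0.1, alpha=0.05)
print(rule)
print(stop_probability(rule, 0.1), stop_probability(rule, 0.3))
```

An O'Brien-Fleming-type rule would instead use a nominal level that starts very small and grows over the course of the trial, which is what makes it more conservative early and more permissive late.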

Conclusions: Because the goal is to minimize the number of patients exposed to and experiencing toxicities from an unsafe therapy, we recommend using the Pocock test or the Beta-Binomial model with a weak prior for constructing safety stopping rules. At the design stage, the operating characteristics of candidate rules should be evaluated under various possible toxicity rates in order to guide the choice of rule(s) for a given trial; our R package facilitates this evaluation.
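The design-stage evaluation described above can be sketched for any candidate rule with binary toxicity outcomes assessed after each patient. The Python example below takes a boundary (the minimum toxicity count that triggers stopping at each sample size) and computes, by exact recursion, the probability of stopping, the expected number of toxicities, and the expected number of patients enrolled under a given true toxicity rate. It is an independent illustration rather than the R package's method. The function name, the candidate boundary, and the grid of toxicity rates are made up for the example.

```python
def operating_characteristics(boundary, p):
    """Exact operating characteristics of a stopping boundary at true rate p.

    boundary[n - 1] is the smallest event count that stops the trial after
    n patients, or None if stopping is impossible at that n.
    Returns (stop probability, expected events, expected patients enrolled).
    """
    alive = {0: 1.0}  # event count -> probability of being there without stopping
    stop_prob = 0.0
    expected_events = 0.0
    expected_n = 0.0
    for n, crossing in enumerate(boundary, start=1):
        nxt = {}
        for x, pr in alive.items():
            nxt[x] = nxt.get(x, 0.0) + pr * (1 - p)   # patient n has no event
            nxt[x + 1] = nxt.get(x + 1, 0.0) + pr * p  # patient n has an event
        alive = {}
        for x, pr in nxt.items():
            if crossing is not None and x >= crossing:
                stop_prob += pr
                expected_events += pr * x
                expected_n += pr * n
            else:
                alive[x] = pr
    # Paths that reach the maximum sample size without stopping.
    n_max = len(boundary)
    for x, pr in alive.items():
        expected_events += pr * x
        expected_n += pr * n_max
    return stop_prob, expected_events, expected_n


# Hypothetical candidate rule for up to 10 patients, for illustration only.
candidate = [None, 2, 3, 3, 3, 4, 4, 4, 5, 5]
for rate in (0.1, 0.2, 0.3, 0.4):
    print(rate, operating_characteristics(candidate, rate))
```

Evaluated at the null toxicity rate, the stopping probability is the chance of a false safety signal; at higher rates it is the power of the rule, to be weighed against the expected number of toxicities.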

Keywords: R Package; Toxicity monitoring; blood and marrow transplant; sequential testing; sickle cell disease.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Bayes Theorem
  • Child
  • Humans
  • Models, Statistical*
  • Outcome Assessment, Health Care
  • Probability
  • Research Design*

Associated data

  • ClinicalTrials.gov/NCT00745420