Cochrane Centralised Search Service showed high sensitivity identifying randomized controlled trials: A retrospective analysis

J Clin Epidemiol. 2020 Nov;127:142-150. doi: 10.1016/j.jclinepi.2020.08.008. Epub 2020 Aug 13.


Background and objectives: The Cochrane Central Register of Controlled Trials (CENTRAL) is compiled from a number of sources, including PubMed and Embase. Since 2017, we have increased the number of sources feeding into CENTRAL and improved the efficiency of our processes through the use of application programming interfaces, machine learning, and crowdsourcing. Our objectives were twofold: (1) Assess the effectiveness of Cochrane's centralized search and screening processes to correctly identify references to published reports which are eligible for inclusion in Cochrane systematic reviews of randomized controlled trials (RCTs). (2) Identify opportunities to improve the performance of Cochrane's centralized search and screening processes to identify references to eligible trials.

Methods: We identified all references to RCTs (either published journal articles or trial registration records) with a publication or registration date between 1st January 2017 and 31st December 2018 that had been included in a Cochrane intervention review. We then viewed an audit trail for each included reference to determine if it had been identified by our centralized search process and subsequently added to CENTRAL.

Results: We identified 650 references to included studies with a publication year of 2017 or 2018. Of those, 634 (97.5%) had been captured by Cochrane's Centralised Search Service. Sixteen references had been missed by Cochrane's Centralised Search Service: six had PubMed-not-MEDLINE status, four were missed by the centralized Embase search, three had been misclassified by Cochrane Crowd, one was from a journal not indexed in MEDLINE or Embase, one had only been added to Embase in 2019, and one reference had been rejected by the automated RCT machine learning classifier. Of the sixteen missed references, eight were the main or only publication of the trial in the review in which it had been included.
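The arithmetic behind the reported figures can be checked with a minimal sketch. The counts are taken from the Results above; the variable names, the short reason labels, and the `sensitivity` helper are illustrative assumptions, not part of the study's own tooling.

```python
# Counts as reported in the Results paragraph.
TOTAL_REFERENCES = 650   # references to included studies (2017 or 2018)
CAPTURED = 634           # captured by the Centralised Search Service

# Reasons for the missed references; the labels are shortened
# paraphrases chosen for this sketch.
missed_by_reason = {
    "PubMed-not-MEDLINE status": 6,
    "missed by centralized Embase search": 4,
    "misclassified by Cochrane Crowd": 3,
    "journal not indexed in MEDLINE or Embase": 1,
    "added to Embase only in 2019": 1,
    "rejected by RCT machine learning classifier": 1,
}


def sensitivity(captured: int, total: int) -> float:
    """Proportion of eligible references that were captured."""
    return captured / total


missed_total = sum(missed_by_reason.values())
sens_pct = round(100 * sensitivity(CAPTURED, TOTAL_REFERENCES), 1)

print(f"Missed references: {missed_total}")
print(f"Sensitivity: {sens_pct:.1f}%")
```

Here sensitivity is simply captured references divided by all eligible references, and the per-reason counts sum to the difference between the two totals.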

Conclusion: This analysis has shown that Cochrane's centralized search and screening processes are highly sensitive. It has also helped us to understand better why some references to eligible RCTs have been missed. The Centralised Search Service (CSS) is playing a critical role in helping to populate CENTRAL and is moving us toward making CENTRAL a comprehensive repository of RCTs.

Keywords: Cochrane central register of controlled trials; Crowdsourcing; Evidence synthesis; Information retrieval; Machine learning; Methodological filter; Randomized controlled trial; Systematic review.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Crowdsourcing / statistics & numerical data
  • Data Aggregation
  • Databases, Bibliographic* / statistics & numerical data
  • Humans
  • Information Storage and Retrieval / methods*
  • Information Storage and Retrieval / statistics & numerical data
  • Machine Learning
  • PubMed
  • Randomized Controlled Trials as Topic*
  • Registries* / statistics & numerical data
  • Retrospective Studies
  • Sensitivity and Specificity
  • Systematic Reviews as Topic*