An evaluation of Cochrane Crowd found that crowdsourcing produced accurate results in identifying randomized trials

J Clin Epidemiol. 2021 May;133:130-139. doi: 10.1016/j.jclinepi.2021.01.006. Epub 2021 Jan 18.


Background and objectives: Filtering the deluge of new research to facilitate evidence synthesis has proven to be unmanageable using current paradigms of search and retrieval. Crowdsourcing, a way of harnessing the collective effort of a "crowd" of people, has the potential to support evidence synthesis by addressing this information overload created by the exponential growth in primary research outputs. Cochrane Crowd, Cochrane's citizen science platform, offers a range of tasks aimed at identifying studies related to health care. Accompanying each task are brief, interactive training modules, and agreement algorithms that help ensure accurate collective decision-making. The aims of the study were to evaluate the performance of Cochrane Crowd in terms of its accuracy, capacity, and autonomy and to examine contributor engagement across three tasks aimed at identifying randomized trials.

Study design and setting: Crowd accuracy was evaluated by measuring the sensitivity and specificity of crowd screening decisions on a sample of titles and abstracts, compared with "quasi gold-standard" decisions about the same records using the conventional methods of dual screening. Crowd capacity, in the form of output volume, was evaluated by measuring the number of records processed by the crowd, compared with baseline. Crowd autonomy, the capability of the crowd to produce accurate collectively derived decisions without the need for expert resolution, was measured by the proportion of records that needed resolving by an expert.
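As an illustration of the accuracy and autonomy measures described above, the following is a minimal Python sketch. It assumes each screened record carries three flags: the crowd's collective decision, the quasi gold-standard (dual screening) decision, and whether an expert had to resolve the record. The `Record` class, its field names, and the sample data are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """One screened title/abstract (hypothetical structure, for illustration only)."""
    crowd_is_rct: bool       # collective crowd decision
    reference_is_rct: bool   # quasi gold-standard decision from dual screening
    needed_expert: bool      # True if an expert had to resolve the record


def sensitivity(records):
    """Proportion of reference-standard RCTs that the crowd also classed as RCTs."""
    tp = sum(r.crowd_is_rct and r.reference_is_rct for r in records)
    fn = sum((not r.crowd_is_rct) and r.reference_is_rct for r in records)
    if tp + fn == 0:
        raise ValueError("no reference-positive records in the sample")
    return tp / (tp + fn)


def specificity(records):
    """Proportion of reference-standard non-RCTs that the crowd also rejected."""
    tn = sum((not r.crowd_is_rct) and (not r.reference_is_rct) for r in records)
    fp = sum(r.crowd_is_rct and (not r.reference_is_rct) for r in records)
    if tn + fp == 0:
        raise ValueError("no reference-negative records in the sample")
    return tn / (tn + fp)


def expert_resolution_rate(records):
    """Proportion of records needing expert resolution (lower means more autonomy)."""
    if not records:
        raise ValueError("empty sample")
    return sum(r.needed_expert for r in records) / len(records)


# Made-up sample: 4 reference RCTs (crowd finds 3), 6 non-RCTs (crowd rejects 5),
# and 2 of the 10 records required expert resolution.
sample = (
    [Record(True, True, False)] * 3
    + [Record(False, True, True)]
    + [Record(False, False, False)] * 4
    + [Record(False, False, True)]
    + [Record(True, False, False)]
)

print(f"sensitivity: {sensitivity(sample):.3f}")
print(f"specificity: {specificity(sample):.3f}")
print(f"expert resolution rate: {expert_resolution_rate(sample):.3f}")
```

The sketch treats the dual-screening decision as the reference standard, as the study design does. It does not model the agreement algorithm that produces the collective crowd decision, which the abstract does not specify.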

Results: The Cochrane Crowd community currently has 18,897 contributors from 163 countries. Collectively, the Crowd has processed 1,021,227 records, helping to identify 178,437 reports of randomized controlled trials (RCTs) for Cochrane's Central Register of Controlled Trials. The sensitivity for each task was 99.1% for the RCT identification task (RCT ID), 99.7% for the RCT identification task of trials from ClinicalTrials.gov (CT ID), and 97.7% for the identification of RCTs from the International Clinical Trials Registry Platform (ICTRP ID). The specificity for each task was 99% for RCT ID, 98.6% for CT ID, and 99.1% for ICTRP ID. The capacity of the combined Crowd and machine learning workflow has increased fivefold in 6 years, compared with baseline. The proportion of records requiring expert resolution across the tasks ranged from 16.6% to 19.7%.

Conclusion: Cochrane Crowd is sufficiently accurate and scalable to keep pace with the current rate of publication (and registration) of new primary studies. It has also proved to be a popular, efficient, and accurate way for a large number of people to play an important voluntary role in health evidence production. Cochrane Crowd is now an established part of Cochrane's effort to manage the deluge of primary research being produced.

Keywords: Citizen science; Cochrane; Crowdsourcing; Evidence production; Human intelligence tasking; Information management; Machine learning; Randomized controlled trial; Screening; Systematic review.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adult
  • Aged
  • Aged, 80 and over
  • Algorithms
  • Biomedical Research / methods*
  • Biomedical Research / standards*
  • Biomedical Research / statistics & numerical data
  • Crowdsourcing / methods*
  • Crowdsourcing / standards*
  • Crowdsourcing / statistics & numerical data
  • Female
  • Humans
  • Male
  • Middle Aged
  • Patient Selection*
  • Randomized Controlled Trials as Topic / methods*
  • Randomized Controlled Trials as Topic / standards*
  • Randomized Controlled Trials as Topic / statistics & numerical data
  • Sensitivity and Specificity