Assessing the accuracy of machine-assisted abstract screening with DistillerAI: a user study
- PMID: 31727159
- PMCID: PMC6857277
- DOI: 10.1186/s13643-019-1221-3
Abstract
Background: Web applications that employ natural language processing technologies to support systematic reviewers during abstract screening have become more common. The goal of our project was to conduct a case study to explore a screening approach that temporarily replaces a human screener with a semi-automated screening tool.
Methods: We evaluated the accuracy of the approach using DistillerAI as a semi-automated screening tool. A published comparative effectiveness review served as the reference standard. Five teams of professional systematic reviewers screened the same 2472 abstracts in parallel. Each team trained DistillerAI with 300 randomly selected abstracts that the team screened dually. For all remaining abstracts, DistillerAI replaced one human screener and provided predictions about the relevance of records. A single reviewer also screened all remaining abstracts. A second human screener resolved conflicts between the single reviewer and DistillerAI. We compared the decisions of the machine-assisted approach, single-reviewer screening, and screening with DistillerAI alone against the reference standard.
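As an illustration of the workflow just described, the sketch below models the decision logic in Python: one human vote, one machine vote, and a second human who is consulted only for records where the two disagree. All identifiers (Record, machine_screener, conflict_resolver, etc.) are hypothetical stand-ins for illustration, not DistillerAI's actual interface.

```python
# Minimal sketch of the machine-assisted screening workflow, assuming
# hypothetical names; DistillerAI's real API is not shown here.
from dataclasses import dataclass
from typing import Callable

INCLUDE, EXCLUDE = "include", "exclude"

@dataclass
class Record:
    id: str
    abstract: str

def machine_assisted_decision(
    record: Record,
    human_screener: Callable[[Record], str],     # first human reviewer
    machine_screener: Callable[[Record], str],   # machine prediction (stand-in)
    conflict_resolver: Callable[[Record], str],  # second human reviewer
) -> str:
    """Return the final include/exclude decision for one abstract."""
    human_vote = human_screener(record)
    machine_vote = machine_screener(record)
    if human_vote == machine_vote:
        return human_vote              # agreement: decision stands
    return conflict_resolver(record)   # disagreement: second human decides
```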
Results: The combined sensitivity of the machine-assisted screening approach across the five screening teams was 78% (95% confidence interval [CI], 66 to 90%), and the combined specificity was 95% (95% CI, 92 to 97%). By comparison, the sensitivity of single-reviewer screening was similar (78%; 95% CI, 66 to 89%); however, the sensitivity of DistillerAI alone was substantially worse (14%; 95% CI, 0 to 31%) than that of the machine-assisted screening approach. Specificities for single-reviewer screening and DistillerAI were 94% (95% CI, 91 to 97%) and 98% (95% CI, 97 to 100%), respectively. Machine-assisted screening and single-reviewer screening had similar areas under the curve (0.87 and 0.86, respectively); by contrast, the area under the curve for DistillerAI alone was only slightly better than chance (0.56). The interrater agreement between human screeners and DistillerAI with a prevalence-adjusted kappa was 0.85 (95% CI, 0.84 to 0.86).
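For readers unfamiliar with these metrics, the following Python sketch computes sensitivity, specificity, and a prevalence-adjusted kappa from a 2x2 table of screening decisions against a reference standard. It assumes the PABAK variant (prevalence-adjusted, bias-adjusted kappa, 2 * p_o - 1), which may differ from the exact variant used in the study; the counts in the usage line are placeholders chosen only to roughly mirror the reported sensitivity and specificity, not the study's data.

```python
# Illustrative computation of the accuracy metrics reported above.
def screening_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    total = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)   # relevant records correctly included
    specificity = tn / (tn + fp)   # irrelevant records correctly excluded
    observed_agreement = (tp + tn) / total
    # Prevalence-adjusted, bias-adjusted kappa (PABAK): 2 * p_o - 1
    pabak = 2 * observed_agreement - 1
    return {"sensitivity": sensitivity,
            "specificity": specificity,
            "pabak": pabak}

# Placeholder counts (not the study's data): sensitivity = 0.78, specificity = 0.95
print(screening_metrics(tp=78, fp=120, fn=22, tn=2252))
```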
Conclusions: The accuracy of DistillerAI is not yet adequate to replace a human screener temporarily during abstract screening for systematic reviews. Semi-automation tools may have greater utility for rapid reviews, which do not require detecting the totality of the relevant evidence, than for traditional systematic reviews.
Keywords: Accuracy; Machine-learning; Methods study; Rapid reviews; Systematic reviews.
Conflict of interest statement
The authors declare that they have no competing interests.
Similar articles
- Assessing the Accuracy of Machine-Assisted Abstract Screening With DistillerAI: A User Study [Internet]. Rockville (MD): Agency for Healthcare Research and Quality (US); 2019 Nov. Report No.: 19(20)-EHC026-EF. PMID: 31804782. Review.
- Performance and usability of machine learning for screening in systematic reviews: a comparative evaluation of three tools. Syst Rev. 2019 Nov 15;8(1):278. doi: 10.1186/s13643-019-1222-2. PMID: 31727150.
- Single-reviewer abstract screening missed 13 percent of relevant studies: a crowd-based, randomized controlled trial. J Clin Epidemiol. 2020 May;121:20-28. doi: 10.1016/j.jclinepi.2020.01.005. Epub 2020 Jan 21. PMID: 31972274.
- Text mining to support abstract screening for knowledge syntheses: a semi-automated workflow. Syst Rev. 2021 May 26;10(1):156. doi: 10.1186/s13643-021-01700-x. PMID: 34039433.
- Inter-reviewer reliability of human literature reviewing and implications for the introduction of machine-assisted systematic reviews: a mixed-methods review. BMJ Open. 2024 Mar 19;14(3):e076912. doi: 10.1136/bmjopen-2023-076912. PMID: 38508610. Review.
Cited by
- Semi-automated title-abstract screening using natural language processing and machine learning. Syst Rev. 2024 Nov 1;13(1):274. doi: 10.1186/s13643-024-02688-w. PMID: 39487499.
- An exploration of available methods and tools to improve the efficiency of systematic review production: a scoping review. BMC Med Res Methodol. 2024 Sep 18;24(1):210. doi: 10.1186/s12874-024-02320-4. PMID: 39294580. Review.
- Human-Comparable Sensitivity of Large Language Models in Identifying Eligible Studies Through Title and Abstract Screening: 3-Layer Strategy Using GPT-3.5 and GPT-4 for Systematic Reviews. J Med Internet Res. 2024 Aug 16;26:e52758. doi: 10.2196/52758. PMID: 39151163.
- Rapid review methods series: Guidance on the use of supportive software. BMJ Evid Based Med. 2024 Jul 23;29(4):264-271. doi: 10.1136/bmjebm-2023-112530. PMID: 38242566.
- Machine Learning Methods for Systematic Reviews: A Rapid Scoping Review. Dela J Public Health. 2023 Nov 30;9(4):40-47. doi: 10.32481/djph.2023.11.008. eCollection 2023 Nov. PMID: 38173960.
