Differential diagnosis generators: an evaluation of currently available computer programs
- PMID: 21789717
- PMCID: PMC3270234
- DOI: 10.1007/s11606-011-1804-8
Abstract
Background: Differential diagnosis (DDX) generators are computer programs that generate a DDX based on various clinical data.
Objective: We identified evaluation criteria through consensus, applied these criteria to describe the features of DDX generators, and tested performance using cases from the New England Journal of Medicine (NEJM©) and the Medical Knowledge Self Assessment Program (MKSAP©).
Methods: We first identified evaluation criteria by consensus. Then we performed Google® and PubMed searches to identify DDX generators. To be included, DDX generators had to do the following: generate a list of potential diagnoses rather than text or article references; rank or indicate critical diagnoses that need to be considered or eliminated; accept at least two signs, symptoms or disease characteristics; provide the ability to compare the clinical presentations of diagnoses; and provide diagnoses in general medicine. The evaluation criteria were then applied to the included DDX generators. Lastly, the performance of the DDX generators was tested with findings from 20 test cases. Performance on each case was scored from one through five, with a score of five indicating presence of the exact diagnosis. Mean scores and confidence intervals were calculated.
Key results: Twenty-three programs were initially identified and four met the inclusion criteria. These four programs were evaluated using the consensus criteria, which included the following: input method; mobile access; filtering and refinement; lab values, medications, and geography as diagnostic factors; evidence-based medicine (EBM) content; references; and drug information content source. The mean scores (95% confidence interval) from performance testing on a five-point scale were Isabel© 3.45 (2.53-4.37), DxPlain® 3.45 (2.63-4.27), Diagnosis Pro® 2.65 (1.75-3.55) and PEPID™ 1.70 (0.71-2.69). The number of exact matches paralleled the mean score finding.
Conclusions: Consensus criteria for DDX generator evaluation were developed. Application of these criteria as well as performance testing supports the use of DxPlain® and Isabel© over the other currently available DDX generators.
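The Methods state that mean scores and confidence intervals were calculated over 20 test cases, but the abstract does not say which interval method was used. The following is a minimal sketch of one common approach, a Student's t interval around the mean, and is not a reconstruction of the authors' analysis. The scores in the list are hypothetical and are not data from the study; the function name `mean_ci` and the hard-coded critical value are assumptions made for illustration.

```python
import math
import statistics

# Two-sided 95% critical value of Student's t with 19 degrees of freedom
# (20 cases minus 1). Assumption: the abstract does not name the interval
# method, so a t interval is used here only as an illustration.
T_CRIT_DF19 = 2.093


def mean_ci(scores, t_crit=T_CRIT_DF19):
    """Return (mean, lower, upper) for a t-based confidence interval."""
    n = len(scores)
    mean = statistics.mean(scores)
    # Standard error of the mean from the sample standard deviation.
    sem = statistics.stdev(scores) / math.sqrt(n)
    half_width = t_crit * sem
    return mean, mean - half_width, mean + half_width


# Hypothetical per-case scores on the one-to-five scale (NOT study data).
# A score of five means the exact diagnosis appeared in the generated list.
scores = [5, 5, 1, 4, 5, 2, 5, 1, 3, 5, 4, 1, 5, 2, 5, 3, 1, 5, 4, 4]

mean, lower, upper = mean_ci(scores)
exact_matches = sum(1 for s in scores if s == 5)

print(f"mean score: {mean:.2f} (95% CI {lower:.2f}-{upper:.2f})")
print(f"exact matches: {exact_matches} of {len(scores)}")
```

With only 20 cases the interval is wide relative to the five-point scale, which is consistent with the overlapping intervals reported for the four programs.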
