Dynamic categorization of clinical research eligibility criteria by hierarchical clustering
- PMID: 21689783
- PMCID: PMC3183114
- DOI: 10.1016/j.jbi.2011.06.001
Abstract
Objective: To semi-automatically induce semantic categories of eligibility criteria from text and to automatically classify eligibility criteria based on their semantic similarity.
Design: UMLS semantic types and a set of previously developed semantic preference rules were used to create an unambiguous semantic feature representation. This representation was then used both to induce eligibility criteria categories through hierarchical clustering and to train supervised classifiers.
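The clustering step in the design can be illustrated with a minimal sketch. It assumes each criterion has already been reduced to a set of UMLS semantic types; the example sets, the Jaccard distance, the average linkage, and the stopping threshold are all illustrative assumptions, since the abstract does not specify them.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Return 1 - |a & b| / |a | b| for two sets of semantic types."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def average_linkage(c1, c2, features):
    """Mean pairwise distance between the members of two clusters."""
    total = sum(jaccard_distance(features[i], features[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def agglomerative_clusters(features, threshold):
    """Repeatedly merge the two closest clusters.

    Stops when the closest pair is farther apart than `threshold`
    (an assumed cut-off, not a value from the paper).
    """
    clusters = [[i] for i in range(len(features))]
    while len(clusters) > 1:
        (a, b), dist = min(
            (
                ((x, y), average_linkage(clusters[x], clusters[y], features))
                for x, y in combinations(range(len(clusters)), 2)
            ),
            key=lambda item: item[1],
        )
        if dist > threshold:
            break
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return [sorted(c) for c in clusters]


# Hypothetical criteria, each represented by a set of UMLS semantic types.
criteria_features = [
    {"Disease or Syndrome"},
    {"Disease or Syndrome", "Finding"},
    {"Pharmacologic Substance"},
    {"Pharmacologic Substance", "Therapeutic or Preventive Procedure"},
    {"Age Group"},
]

clusters = agglomerative_clusters(criteria_features, threshold=0.6)
print(sorted(clusters))
```

In this toy input the two disease-related criteria and the two drug-related criteria each merge, while the age criterion stays on its own. In a semi-automated workflow, a person would then review the resulting clusters and name them as categories.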
Measurements: We induced 27 categories and measured their prevalence in 27,278 eligibility criteria from 1,578 clinical trials. We then compared classification performance (precision, recall, and F1-score) between the UMLS-based feature representation and the "bag of words" feature representation across five common classifiers in Weka: J48, Bayesian Network, Naïve Bayesian, Nearest Neighbor, and an instance-based learning classifier.
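For reference, the three reported metrics can be computed per category as in the sketch below. The category labels and predictions are hypothetical and serve only to show the arithmetic; they are not data from the study.

```python
def per_category_scores(gold, predicted):
    """Return {category: (precision, recall, f1)} for paired label lists."""
    scores = {}
    for cat in sorted(set(gold) | set(predicted)):
        pairs = list(zip(gold, predicted))
        tp = sum(1 for g, p in pairs if g == cat and p == cat)  # true positives
        fp = sum(1 for g, p in pairs if g != cat and p == cat)  # false positives
        fn = sum(1 for g, p in pairs if g == cat and p != cat)  # false negatives
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        scores[cat] = (precision, recall, f1)
    return scores


# Hypothetical gold-standard and predicted categories for six criteria.
gold = ["disease", "disease", "age", "drug", "drug", "drug"]
predicted = ["disease", "drug", "age", "drug", "drug", "disease"]

scores = per_category_scores(gold, predicted)
for category, (p, r, f1) in scores.items():
    print(f"{category}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

Running the same function once on predictions from each feature representation gives per-category F1-scores that can be compared directly.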
Results: The UMLS semantic feature representation outperformed the "bag of words" feature representation in 89% of the criteria categories. With the semantically induced categories, machine-learning classifiers required only 2,000 instances for classification performance to stabilize. The J48 classifier yielded the best F1-score, and the Bayesian Network classifier achieved the best learning efficiency.
Conclusion: The UMLS is an effective knowledge source and can enable an efficient feature representation for semi-automated semantic category induction and automatic categorization for clinical research eligibility criteria and possibly other clinical text.
Copyright © 2011 Elsevier Inc. All rights reserved.
