Large-scale automated machine reading discovers new cancer-driving mechanisms
- PMID: 30256986
- PMCID: PMC6156821
- DOI: 10.1093/database/bay098
Abstract
PubMed, a repository and search engine for biomedical literature, now indexes >1 million articles each year. This exceeds the processing capacity of human domain experts, limiting our ability to truly understand many diseases. We present Reach, a system for automated, large-scale machine reading of biomedical papers that can extract mechanistic descriptions of biological processes with relatively high precision at high throughput. We demonstrate that combining the extracted pathway fragments with existing biological data analysis algorithms that rely on curated models helps identify and explain a large number of previously unidentified mutually exclusive altered signaling pathways in seven different cancer types. This work shows that combining human-curated 'big mechanisms' with extracted 'big data' can lead to a causal, predictive understanding of cellular processes and unlock important downstream applications.
