Advanced literature analysis in a Big Data world
- PMID: 27859320
- DOI: 10.1111/nyas.13270
Abstract
Comprehensive data mining of the scientific literature has become increasingly challenging. To address this, Elsevier's Pathway Studio software uses natural language processing to systematically extract specific biological information from journal articles and abstracts; this information is then used to create a very large, structured, and constantly expanding literature knowledgebase. Sophisticated visualization tools allow the user to interactively explore the vast number of connections created and stored in the Pathway Studio database. We demonstrate the value of this structured-information approach with a biomarker use case and describe a comprehensive collection of biomarkers and biomarker candidates, as reported in the literature. Using four major neuropsychiatric diseases, we show common and unique biomarker elements, demonstrate specific enrichment patterns, and highlight strategies for identifying the most recent and novel reports for potential biomarker discovery. Finally, we introduce a new taxonomy based on brain region identifications, which greatly increases the potential depth and complexity of information retrieval available for neuroscience research.
Keywords: Pathway Studio; literature analysis; literature data mining; natural language processing.
© 2016 New York Academy of Sciences.
