Advanced literature analysis in a Big Data world

Ann N Y Acad Sci. 2017 Jan;1387(1):25-33. doi: 10.1111/nyas.13270. Epub 2016 Nov 10.

Abstract

Comprehensive data mining of the scientific literature has become increasingly challenging. To address this, Elsevier's Pathway Studio software uses natural language processing to systematically extract specific biological information from journal articles and abstracts; this information is then used to build a very large, structured, and constantly expanding literature knowledgebase. Visualization tools allow the user to interactively explore the vast number of connections stored in the Pathway Studio database. We demonstrate the value of this structured-information approach with a biomarker use case and describe a comprehensive collection of biomarkers and biomarker candidates reported in the literature. Using four major neuropsychiatric diseases, we identify common and unique biomarker elements, show specific enrichment patterns, and highlight strategies for identifying the most recent and novel reports for potential biomarker discovery. Finally, we introduce a new taxonomy based on brain region identifications, which greatly increases the depth and complexity of information that can be retrieved for neuroscience research.

Keywords: Pathway Studio; literature analysis; literature data mining; natural language processing.

Publication types

  • Review

MeSH terms

  • Abstracting and Indexing
  • Animals
  • Anxiety Disorders / classification
  • Anxiety Disorders / diagnosis
  • Anxiety Disorders / metabolism
  • Anxiety Disorders / therapy
  • Biomarkers / metabolism
  • Biomedical Research / methods*
  • Biomedical Research / trends
  • Bipolar Disorder / classification
  • Bipolar Disorder / diagnosis
  • Bipolar Disorder / metabolism
  • Bipolar Disorder / therapy
  • Computational Biology / methods*
  • Computational Biology / trends
  • Data Mining / methods*
  • Data Mining / trends
  • Database Management Systems* / trends
  • Databases, Bibliographic
  • Depressive Disorder, Major / classification
  • Depressive Disorder, Major / diagnosis
  • Depressive Disorder, Major / metabolism
  • Depressive Disorder, Major / therapy
  • Humans
  • Mass Screening / methods*
  • Mass Screening / trends
  • Mental Disorders / classification
  • Mental Disorders / diagnosis*
  • Mental Disorders / metabolism
  • Mental Disorders / therapy
  • National Institute of Mental Health (U.S.)
  • Natural Language Processing*
  • Periodicals as Topic
  • Prognosis
  • Schizophrenia / classification
  • Schizophrenia / diagnosis
  • Schizophrenia / metabolism
  • Schizophrenia / therapy
  • Software
  • Translational Research, Biomedical / methods
  • Translational Research, Biomedical / trends
  • United States

Substances

  • Biomarkers