A systematic review of natural language processing and text mining of symptoms from electronic patient-authored text data

Int J Med Inform. 2019 May:125:37-46. doi: 10.1016/j.ijmedinf.2019.02.008. Epub 2019 Feb 20.

Abstract

Objective: In this systematic review, we aim to synthesize the literature on the use of natural language processing (NLP) and text mining as they apply to symptom extraction and processing in electronic patient-authored text (ePAT).

Materials and methods: A comprehensive literature search of PubMed and EMBASE returned 1964 articles, which were narrowed to 21 eligible articles. Data related to purpose, text source, number of users and/or posts, evaluation metrics, and quality indicators were recorded.

Results: Pain (n = 18) and fatigue and sleep disturbance (n = 18) were the most frequently evaluated symptom clinical content categories. Studies accessed ePAT from sources such as Twitter, online community forums, and patient portals focused on diseases including diabetes, cancer, and depression. Fifteen studies used NLP as a primary methodology. Studies reported evaluation metrics, including precision, recall, and F-measure, for symptom-specific research questions.
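
For readers unfamiliar with the evaluation metrics named above, the following is a minimal sketch of how precision, recall, and the balanced F-measure (F1) are typically computed for a symptom-extraction task. It is an illustration only and is not drawn from any of the reviewed studies: the function name `precision_recall_f1`, the (post, symptom) pair representation, and the example data are all hypothetical.

```python
def precision_recall_f1(predicted, gold):
    """Score predicted symptom mentions against gold-standard annotations.

    Both arguments are iterables of hashable items, assumed here to be
    (post_id, symptom) pairs. This representation is an assumption made
    for illustration, not a format taken from the reviewed studies.
    """
    predicted, gold = set(predicted), set(gold)
    tp = len(predicted & gold)   # mentions the system found that annotators also marked
    fp = len(predicted - gold)   # mentions the system found that annotators did not mark
    fn = len(gold - predicted)   # annotated mentions the system missed

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return precision, recall, f1


# Hypothetical gold annotations and system output for three posts.
gold = {("post1", "pain"), ("post1", "fatigue"),
        ("post2", "insomnia"), ("post3", "pain")}
predicted = {("post1", "pain"), ("post2", "insomnia"), ("post2", "nausea")}

p, r, f = precision_recall_f1(predicted, gold)
print(f"precision={p:.2f} recall={r:.2f} F1={f:.2f}")
# prints: precision=0.67 recall=0.50 F1=0.57
```

In this toy example the system finds two of the four annotated mentions and adds one spurious mention, so precision (2 of 3 predictions correct) is higher than recall (2 of 4 annotations found).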

Discussion: NLP and text mining have been used to extract and analyze patient-authored symptom data in a wide variety of online communities. Though there are computational challenges with accessing ePAT, the depth of information provided directly from patients offers new horizons for precision medicine, characterization of sub-clinical symptoms, and the creation of personal health libraries as outlined by the National Library of Medicine.

Conclusion: Future research should consider the needs of patients expressed through ePAT and its relevance to symptom science. Understanding the role that ePAT plays in health communication and real-time assessment of symptoms, through the use of NLP and text mining, is critical to a patient-centered health system.

Keywords: Electronic patient-authored text; Natural language processing; Review; Signs and symptoms.

Publication types

  • Research Support, N.I.H., Extramural
  • Systematic Review

MeSH terms

  • Data Mining*
  • Electronic Health Records*
  • Humans
  • National Institutes of Health (U.S.)
  • Natural Language Processing*
  • Patient Participation*
  • United States