Recurrent Deep Network Models for Clinical NLP Tasks: Use Case with Sentence Boundary Disambiguation

Stud Health Technol Inform. 2019 Aug 21:264:198-202. doi: 10.3233/SHTI190211.

Abstract

Although a number of foundational natural language processing (NLP) tasks like text segmentation are considered a simple problem in the general English domain dominated by well-formed text, complexities of clinical documentation lead to poor performance of existing solutions designed for the general English domain. We present an alternative solution that relies on a convolutional neural network layer followed by a bidirectional long short-term memory layer (CNN-Bi-LSTM) for the task of sentence boundary disambiguation and describe an ensemble approach for domain adaptation using two training corpora. Implementations using the Keras neural-networks API are available at https://github.com/NLPIE/clinical-sentences.

Keywords: Machine Learning; Natural Language Processing; Neural Networks (Computer).

MeSH terms

  • Documentation
  • Language
  • Natural Language Processing*
  • Neural Networks, Computer*