Utilizing soft constraints to enhance medical relation extraction from the history of present illness in electronic medical records

J Biomed Inform. 2018 Nov:87:108-117. doi: 10.1016/j.jbi.2018.09.013. Epub 2018 Oct 4.

Abstract

Relation extraction between medical concepts in electronic medical records has wide-ranging and important applications. However, previous studies that use machine learning algorithms judge the semantic types of medical concept pair mentions independently. In fact, different concept pair mentions in the same context exhibit dependencies that can provide useful evidence for identifying their relation types. To the best of our knowledge, only one study has considered such dependencies, and it did so in discharge summaries. Its hard constraints, however, do not apply effectively to the History of Present Illness (HPI) in electronic medical records. Based on the writing characteristics of HPI records, we generalize two regularities of the dependencies among concept pairs mentioned in an HPI record to enhance the performance of relation extraction. We incorporate the two soft constraints corresponding to these regularities, together with the posterior probabilities returned by a local classifier, into a joint inference process that applies integer quadratic programming to carry out collective classification of all concept pair mentions in an HPI record. We implement four local classification models (support vector machine, logistic regression, random forest, and piecewise convolutional neural networks) to examine the performance of our approach. A series of experimental results demonstrates that our collective classification method yields a substantial improvement and outperforms other state-of-the-art methods.
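
The joint inference idea can be illustrated with a minimal sketch. The abstract does not specify the two regularities or the exact objective, so everything here is an assumption made for illustration: the function name `joint_inference`, the labels "A" and "B", the posterior values, and the pairwise reward in `pair_weights` are all hypothetical. The sketch scores a joint labeling as the sum of local posteriors plus a reward for each satisfied soft constraint, which is quadratic in the 0/1 label-indicator variables, and it uses brute-force enumeration in place of an integer quadratic programming solver.

```python
from itertools import product


def joint_inference(posteriors, pair_weights):
    """Pick the joint labeling with the highest total score.

    posteriors: one dict per concept pair mention, mapping label -> probability
        from a local classifier (values here are made up).
    pair_weights: dict mapping (i, j, label_i, label_j) -> reward added when
        mentions i and j take those labels together. This is a hypothetical
        stand-in for a soft constraint; it is not the paper's formulation.
    """
    best_labels, best_score = None, float("-inf")
    # Enumerate every joint assignment; feasible only for tiny examples.
    for labels in product(*[list(p) for p in posteriors]):
        # Local (linear) term: posterior of each chosen label.
        score = sum(p[lab] for p, lab in zip(posteriors, labels))
        # Pairwise (quadratic) term: rewards for co-occurring labels.
        for (i, j, li, lj), w in pair_weights.items():
            if labels[i] == li and labels[j] == lj:
                score += w
        if score > best_score:
            best_labels, best_score = labels, score
    return best_labels


# Two mentions in one record; the second is nearly a coin flip locally.
posteriors = [{"A": 0.9, "B": 0.1}, {"A": 0.45, "B": 0.55}]
# Hypothetical soft constraint: mentions 0 and 1 tend to share label "A".
pair_weights = {(0, 1, "A", "A"): 0.3}

independent = joint_inference(posteriors, {})
collective = joint_inference(posteriors, pair_weights)
```

Because the constraint is soft, it only shifts the score: a sufficiently confident local posterior for "B" on the second mention would still override the pairwise reward, which a hard constraint would not allow.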

Keywords: Electronic medical record; History of present illness; Integer quadratic programming; Relation extraction; Soft constraints.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • China
  • Deep Learning
  • Electronic Health Records*
  • Humans
  • Medical Informatics / methods*
  • Models, Statistical
  • Probability
  • Regression Analysis
  • Reproducibility of Results
  • Support Vector Machine*