A semantic and syntactic text simplification tool for health content

AMIA Annu Symp Proc. 2010 Nov 13:2010:366-70.

Abstract

Text simplification is a challenging NLP task and it is particularly important in the health domain as most health information requires higher reading skills than an average consumer has. This low readability of health content is largely due to the presence of unfamiliar medical terms/concepts and certain syntactic characteristics, such as excessively complex sentences. In this paper, we discuss a simplification tool that was developed to simplify health information. The tool addresses semantic difficulty by substituting difficult terms with easier synonyms or through the use of hierarchically and/or semantically related terms. The tool also simplifies long sentences by splitting them into shorter grammatical sentences. We used the tool to simplify electronic medical records and journal articles and results show that the tool simplifies both document types though by different degrees. A cloze test on the electronic medical records showed a statistically significant improvement in the cloze score from 35.8% to 43.6%.

MeSH terms

  • Comprehension
  • Electronic Health Records
  • Language
  • Paper
  • Semantics*
  • Vocabulary*