Using NLP in openEHR archetypes retrieval to promote interoperability: a feasibility study in China

BMC Med Inform Decis Mak. 2021 Jun 26;21(1):199. doi: 10.1186/s12911-021-01554-2.

Abstract

Background: With the development and application of medical information systems, semantic interoperability is essential for accurate and advanced health-related computing and electronic health record (EHR) information sharing. The openEHR approach can improve semantic interoperability. One key advantage of openEHR is that it allows existing archetypes to be reused. The crucial problem is how to improve precision and resolve ambiguity in archetype retrieval.

Method: Based on query expansion technology and the Word2Vec model in Natural Language Processing (NLP), we propose to find synonyms as substitutes for the original search terms in archetype retrieval. Test sets at different levels of medical expertise are used to verify the feasibility.
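
The synonym-substitution idea can be sketched as a nearest-neighbour lookup in an embedding space. The sketch below is illustrative only: the vectors in `TOY_VECTORS` are invented values standing in for a trained Word2Vec model, and the names `cosine` and `expand_query` are hypothetical, not taken from the study.

```python
import math

# Invented toy vectors standing in for embeddings from a trained Word2Vec model.
TOY_VECTORS = {
    "blood pressure": [0.9, 0.1, 0.0],
    "bp": [0.85, 0.15, 0.05],
    "hypertension": [0.7, 0.3, 0.1],
    "body temperature": [0.1, 0.9, 0.0],
    "fever": [0.2, 0.8, 0.1],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def expand_query(term, vectors, top_n=2):
    """Return the top_n terms most similar to `term` as candidate substitutes."""
    if term not in vectors:
        return []  # out-of-vocabulary term: nothing to substitute
    scored = [
        (other, cosine(vectors[term], vec))
        for other, vec in vectors.items()
        if other != term
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [word for word, _ in scored[:top_n]]


substitutes = expand_query("blood pressure", TOY_VECTORS)
print(substitutes)
```

In a real setting, each substitute would then be submitted to the archetype search in place of the original term; how the study ranks or filters the candidates is not described in this abstract.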

Results: Applying the approach to each original search term (n = 120) in the test sets yielded a total of 69,348 substitutes. Precision at 5 (P@5) improved by 0.767 on average. In the best case, P@5 reached 0.975.
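
For readers unfamiliar with the metric, P@5 is the fraction of the first five retrieved items that are relevant, averaged over queries. A minimal sketch follows, assuming the common convention of dividing by k even when fewer than k results are returned; the archetype identifiers and relevance judgments are made up for illustration.

```python
def precision_at_k(ranked_ids, relevant_ids, k=5):
    """Fraction of the top-k retrieved items that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked_ids[:k]
    hits = sum(1 for item in top_k if item in relevant_ids)
    return hits / k


def mean_precision_at_k(runs, k=5):
    """Average P@k over (ranked_ids, relevant_ids) pairs, one pair per query."""
    return sum(precision_at_k(r, rel, k) for r, rel in runs) / len(runs)


# Hypothetical retrieval results for two queries (placeholder archetype ids).
query_1 = (["a1", "a2", "a3", "a4", "a5", "a6"], {"a1", "a3", "a4"})
query_2 = (["b1", "b2", "b3", "b4", "b5"], {"b1", "b2", "b3", "b4", "b5"})

p1 = precision_at_k(*query_1)
p2 = precision_at_k(*query_2)
mean_p = mean_precision_at_k([query_1, query_2])
print(p1, p2, mean_p)
```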

Conclusions: We introduce a novel approach that uses NLP technology and a corpus to find synonyms as substitutes for the original search terms. Compared to simply mapping the elements contained in openEHR to an external dictionary, this approach could greatly improve precision and resolve ambiguity in retrieval tasks. This helps promote the application of openEHR and advance EHR information sharing.

Keywords: Information retrieval; Interoperability; Natural language processing; OpenEHR.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • China
  • Electronic Health Records*
  • Feasibility Studies
  • Humans
  • Natural Language Processing*
  • Semantics