Extracting clinical terms from radiology reports with deep learning

J Biomed Inform. 2021 Apr:116:103729. doi: 10.1016/j.jbi.2021.103729. Epub 2021 Mar 9.

Abstract

Extracting clinical terms from free-text radiology reports is an important first step toward their secondary use. However, there is no general consensus on the kinds of terms to be extracted. In this paper, we propose an information model comprising three types of clinical entities: observations, clinical findings, and modifiers. Furthermore, to determine its applicability to in-house radiology reports, we extracted clinical terms with state-of-the-art deep learning models and compared the results. We trained and evaluated models using 540 in-house chest computed tomography (CT) reports annotated by multiple medical experts. Two deep learning models were compared, and the effect of pre-training was explored. To investigate the generalizability of the model, we evaluated it on chest CT reports from another institution. The micro F1-scores of our best-performing model on the in-house and external datasets were 95.36% and 94.62%, respectively. Our results indicated that the entities defined in our information model were suitable for extracting clinical terms from radiology reports, and that the model was sufficiently generalizable to be used with datasets from other institutions.
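To make the reported metric concrete, the following is a minimal sketch (not the authors' code) of how an entity-level micro F1-score is typically computed for this kind of extraction task: BIO tag sequences are converted to entity spans, and true positives, false positives, and false negatives are pooled across all sentences and entity types before computing F1. The label names (OBSERVATION, MODIFIER) and the toy sentence are illustrative assumptions based on the three entity types described in the abstract.

```python
# Illustrative sketch only: entity-level micro F1 over BIO-tagged spans.
# Label names mirror the paper's entity types but are assumptions here.

def bio_to_spans(tags):
    """Convert a BIO tag sequence into a set of (start, end, type) spans."""
    spans, start, etype = set(), None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel "O" flushes the last span
        if tag.startswith("B-") or tag == "O" or (
            tag.startswith("I-") and tag[2:] != etype
        ):
            if start is not None:
                spans.add((start, i, etype))
            start, etype = (i, tag[2:]) if tag.startswith("B-") else (None, None)
    return spans

def micro_f1(gold_seqs, pred_seqs):
    """Micro-averaged F1: pool TP/FP/FN over all sentences and entity types."""
    tp = fp = fn = 0
    for gold, pred in zip(gold_seqs, pred_seqs):
        g, p = bio_to_spans(gold), bio_to_spans(pred)
        tp += len(g & p)   # spans matching exactly in boundaries and type
        fp += len(p - g)
        fn += len(g - p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

# Toy example: "small nodule in the right upper lobe"
gold = [["B-MODIFIER", "B-OBSERVATION", "O", "O",
         "B-MODIFIER", "I-MODIFIER", "I-MODIFIER"]]
pred = [["B-MODIFIER", "B-OBSERVATION", "O", "O",
         "B-MODIFIER", "I-MODIFIER", "O"]]
print(f"micro F1 = {micro_f1(gold, pred):.2%}")  # 2 of 3 gold spans matched exactly
```

Because micro averaging pools counts before computing precision and recall, frequent entity types dominate the score, which is why it is commonly reported for corpora where entity types are imbalanced.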

Keywords: Deep Learning; Information Extraction; Natural Language Processing; Radiology Report.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Deep Learning*
  • Natural Language Processing
  • Radiology Information Systems*
  • Radiology*
  • Research Report
  • Tomography, X-Ray Computed