Application of a Domain-specific BERT for Detection of Speech Recognition Errors in Radiology Reports

Radiol Artif Intell. 2022 May 25;4(4):e210185. doi: 10.1148/ryai.210185. eCollection 2022 Jul.

Abstract

Purpose: To develop radiology domain-specific bidirectional encoder representations from transformers (BERT) models that can identify speech recognition (SR) errors and suggest corrections in radiology reports.

Materials and methods: A pretrained BERT model, Clinical BioBERT, was further pretrained on a corpus of 114 008 radiology reports between April 2016 and August 2019 that were retrospectively collected from two hospitals. Next, the model was fine-tuned on a training dataset of generated insertion, deletion, and substitution errors, creating Radiology BERT. This model was retrospectively evaluated on an independent dataset of radiology reports with generated errors (n = 18 885) and on unaltered report sentences (n = 2000) and prospectively evaluated on true clinical SR errors (n = 92). Correction Radiology BERT was separately trained to suggest corrections for detected deletion and substitution errors. Area under the receiver operating characteristic curve (AUC) and bootstrapped 95% CIs were calculated for each evaluation dataset.
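The three generated error types can be illustrated with a minimal Python sketch. The abstract does not describe how the errors were actually produced, so the uniform random sampling, the `make_error` helper, and the toy vocabulary below are assumptions made for illustration only, not the authors' procedure.

```python
import random

def make_error(words, kind, vocab, rng):
    """Return (corrupted_words, index) after applying one synthetic error.

    Hypothetical helper: positions and replacement words are drawn uniformly
    at random, which is an assumption and not the method from the paper.
    """
    words = list(words)  # copy so the caller's sentence is not modified
    if kind == "insertion":
        i = rng.randrange(len(words) + 1)   # an extra word may go at any gap
        words.insert(i, rng.choice(vocab))
    elif kind == "deletion":
        i = rng.randrange(len(words))       # drop one existing word
        del words[i]
    elif kind == "substitution":
        i = rng.randrange(len(words))       # swap one word for a different one
        candidates = [w for w in vocab if w != words[i]]
        words[i] = rng.choice(candidates)
    else:
        raise ValueError(f"unknown error kind: {kind}")
    return words, i

if __name__ == "__main__":
    rng = random.Random(0)
    sentence = "no acute intracranial hemorrhage".split()
    toy_vocab = ["left", "right", "mass", "no"]  # illustrative vocabulary
    for kind in ("insertion", "deletion", "substitution"):
        corrupted, position = make_error(sentence, kind, toy_vocab, rng)
        print(kind, position, " ".join(corrupted))
```

Each call returns the altered sentence together with the position of the change, which is the kind of label a detection model could be fine-tuned on.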

Results: Radiology-specific BERT had AUC values of >0.99 (95% CI: >0.99, >0.99), 0.94 (95% CI: 0.93, 0.94), 0.98 (95% CI: 0.98, 0.98), and 0.97 (95% CI: 0.97, 0.97) for detecting insertion, deletion, substitution, and all errors, respectively, on the independently generated test set. Testing on unaltered report impressions revealed a sensitivity of 82% (28 of 34; 95% CI: 70%, 93%) and specificity of 88% (1521 of 1728; 95% CI: 87%, 90%). Testing on prospective SR errors showed an accuracy of 75% (69 of 92; 95% CI: 65%, 83%). Finally, the correct word was the top suggestion for 45.6% (475 of 1041; 95% CI: 42.5%, 49.3%) of errors.
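An AUC with a bootstrapped 95% CI, as reported above, can be computed along the following lines. This is a sketch under stated assumptions: the abstract does not specify the bootstrap variant or the number of resamples, so the percentile method, the default of 1000 resamples, and the function names `auc` and `bootstrap_ci` are illustrative choices rather than the authors' implementation.

```python
import random

def auc(labels, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case, with ties counted as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC (assumed variant, see lead-in)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample lacks one class, so AUC is undefined; redraw
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lo, hi

if __name__ == "__main__":
    # Made-up labels and scores, purely for illustration.
    y = [0, 0, 1, 1, 0, 1, 0, 1]
    s = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.5, 0.7]
    print(auc(y, s), bootstrap_ci(y, s))
```

The pairwise comparison is quadratic in the number of cases, which is fine for a small illustration; for a test set of the size reported here, a rank-based formulation or an established library routine would be the more practical choice.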

Conclusion: Radiology-specific BERT models fine-tuned on generated errors were able to identify SR errors in radiology reports and suggest corrections.

Supplemental material is available for this article.

© RSNA, 2022

See also the commentary by Abajian and Cheung in this issue.

Keywords: Computer Applications; Technology Assessment.