Automated extraction and normalization of findings from cancer-related free-text radiology reports

AMIA Annu Symp Proc. 2003;2003:420-4.


We describe the performance of a natural language processing system that uses knowledge vectors to extract findings from radiology reports. LifeCode (A-Life Medical, Inc.) has been successfully coding reports for billing purposes for several years. In this study, we used LifeCode to code all findings in a test set of 500 cancer-related radiology reports and compared its output against manual tagging of all findings in the same reports. The system was trained on 1400 reports before the test set was run.

Results: LifeCode had a recall of 84.5% and precision of 95.7% in the coding of cancer-related radiology report findings.

Conclusion: Despite a modest-sized training set and minimal training iterations, the system, when applied to cancer-related reports, achieved recall and precision comparable to those of other reputable natural language processors in this domain.
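For readers unfamiliar with the recall and precision measures reported above, the following is a minimal sketch of how such figures are typically computed when system-extracted findings are compared against a manually tagged reference. It is not the evaluation code used in the study. The function name `precision_recall`, the representation of a finding as a `(report_id, finding)` pair, and the toy data are all assumptions made for illustration.

```python
def precision_recall(extracted, reference):
    """Compare system-extracted findings with manually tagged ones.

    Both arguments are iterables of hashable findings; here a finding is
    a hypothetical (report_id, finding_label) pair.
    """
    extracted, reference = set(extracted), set(reference)
    true_positives = len(extracted & reference)
    # Precision: share of extracted findings that the reference confirms.
    precision = true_positives / len(extracted) if extracted else 0.0
    # Recall: share of reference findings that the system recovered.
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# Toy data, invented for illustration only (not from the study).
reference = {
    ("r1", "mass"),
    ("r1", "effusion"),
    ("r2", "nodule"),
    ("r2", "atelectasis"),
}
extracted = {
    ("r1", "mass"),
    ("r2", "nodule"),
    ("r2", "fracture"),  # not in the reference: a false positive
}

precision, recall = precision_recall(extracted, reference)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

In this toy case two of the three extracted findings are correct and two of the four reference findings are recovered. A system with high precision and lower recall, as in the results above, rarely reports a wrong finding but misses some of the findings a human tagger would mark.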

Publication types

  • Comparative Study
  • Evaluation Study
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Expert Systems*
  • Forms and Records Control / methods
  • Humans
  • Information Storage and Retrieval / methods
  • International Classification of Diseases
  • Natural Language Processing*
  • Neoplasms / diagnostic imaging*
  • Radiography, Thoracic / classification*
  • Radiology Information Systems*
  • Software
  • Vocabulary, Controlled