Automatic Extraction of Skin and Soft Tissue Infection Status from Clinical Notes

Stud Health Technol Inform. 2024 Jan 25:310:579-583. doi: 10.3233/SHTI231031.

Abstract

Reliable identification of skin and soft tissue infections (SSTIs) from electronic health records is important for a number of applications, including quality improvement, clinical guideline construction, and epidemiological analysis. However, in the United States, SSTI types (e.g., whether an infection is purulent or non-purulent) are not captured reliably in structured clinical data. In this work, we trained and evaluated a rule-based clinical natural language processing system using 6,576 manually annotated clinical notes derived from the United States Veterans Health Administration (VA), with the goal of automatically extracting and classifying SSTI subtypes from clinical notes. The trained system achieved performance metrics ranging from 0.39 to 0.80 for mention-level classification and from 0.49 to 0.98 for document-level classification.
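
To illustrate the difference between mention-level and document-level classification, the following is a minimal sketch of a rule-based extractor. It is not the system evaluated in this work: the trigger terms, the negation cues, the label names (`purulent`, `non_purulent`, `no_ssti`), and the aggregation rule that lets a purulent mention take precedence are all assumptions chosen for illustration.

```python
import re
from dataclasses import dataclass

# Hypothetical trigger terms; the lexicon used in the paper is not given here.
RULES = {
    # The lookbehinds keep "non-purulent" from firing the purulent rule.
    "purulent": re.compile(
        r"(?<!non-)(?<!non )\b(abscess|purulen\w*|pus|furuncle|carbuncle)\b", re.I
    ),
    "non_purulent": re.compile(
        r"\b(non[- ]?purulent|cellulitis|erysipelas)\b", re.I
    ),
}

# Assumed negation cues: a cue within 40 characters before the mention,
# inside the same sentence, marks the mention as negated.
NEGATION = re.compile(r"\b(no|denies|without|negative for)\b[^.;]{0,40}$", re.I)


@dataclass
class Mention:
    label: str
    text: str
    start: int
    negated: bool


def extract_mentions(note: str) -> list[Mention]:
    """Mention level: label every trigger term found in the note."""
    mentions = []
    for label, pattern in RULES.items():
        for m in pattern.finditer(note):
            # Left context runs from the previous '.' or ';' up to the mention.
            sentence_start = max(
                note.rfind(".", 0, m.start()), note.rfind(";", 0, m.start())
            ) + 1
            left_context = note[sentence_start:m.start()]
            negated = bool(NEGATION.search(left_context))
            mentions.append(Mention(label, m.group(0), m.start(), negated))
    return sorted(mentions, key=lambda mention: mention.start)


def classify_document(mentions: list[Mention]) -> str:
    """Document level: aggregate the non-negated mentions into one label."""
    labels = {m.label for m in mentions if not m.negated}
    if "purulent" in labels:  # assumed precedence rule
        return "purulent"
    if "non_purulent" in labels:
        return "non_purulent"
    return "no_ssti"


if __name__ == "__main__":
    note = "Left leg cellulitis with erythema. No abscess or purulent drainage."
    found = extract_mentions(note)
    for mention in found:
        print(mention)
    print(classify_document(found))
```

In this sketch a single note can contain several mentions with conflicting labels, which is why mention-level and document-level results are scored separately: the document label depends on how mentions are aggregated, not only on whether each mention is labeled correctly.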

Keywords: Electronic Health Records; Natural Language Processing; Skin and Soft Tissue Infections.

MeSH terms

  • Benchmarking
  • Electronic Health Records
  • Humans
  • Natural Language Processing
  • Skin
  • Soft Tissue Infections* / diagnosis
  • United States