SPARNet: A Framework for Airway Invasion Tracking from Fluoroscopic Videos of Dysphagia Patients

IEEE J Biomed Health Inform. 2026 May 19:PP. doi: 10.1109/JBHI.2026.3695144. Online ahead of print.

Abstract

The videofluoroscopic swallowing study (VFSS) is the clinical gold standard for evaluating dysphagia and detecting airway invasion. However, manual interpretation is time-consuming, subjective, and prone to rater variability due to the rapid and complex nature of swallowing events. Existing deep learning approaches to airway invasion detection are limited by their reliance on manual frame selection and by black-box decision-making that lacks interpretability. In this study, we propose SPARNet (Swallowing Phase Airway Risk Network), an AI pipeline that integrates swallowing-phase classification, bolus and airway delineation, and airway-invasion tracking to provide a clinically interpretable assessment. SPARNet introduces a novel Airway Invasion Tracker (AIT) that partitions the visible laryngo-tracheal air column in VFSS into quadrants, tracks the location of the bolus, and computes temporal invasion parameters (event duration, total exposure time, and number of invasion episodes) to characterize airway invasion during the pharyngeal phase. We curated a VFSS-MultiLabel dataset comprising 125 clips from 75 patients, annotated for multiple tasks. SPARNet achieved strong performance across all components, including pharyngeal phase detection (F1-score = 88%), bolus segmentation (Dice score = 0.83), laryngo-tracheal air column detection (mAP = 0.95), and airway invasion detection (F1-score = 92%). Unlike prior black-box models, SPARNet provides transparent spatial and temporal cues that align with clinical VFSS reasoning. These results demonstrate the feasibility of a system-level AI pipeline for VFSS analysis, highlighting its potential to support clinical decision-making in the management of dysphagia.
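
The abstract does not specify how the AIT computes its temporal invasion parameters. As an illustration only, the following minimal Python sketch shows one plausible way to derive event duration, total exposure time, and number of invasion episodes. It assumes a per-frame boolean flag (bolus overlaps the air column or not) and a known frame rate; the names `InvasionSummary` and `summarize_invasion`, and the rule that an episode is a run of consecutive flagged frames, are hypothetical and not taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class InvasionSummary:
    """Hypothetical container for the three temporal parameters."""
    episode_durations_s: List[float]  # duration of each invasion episode, in seconds
    total_exposure_s: float           # summed time the airway was flagged as invaded
    num_episodes: int                 # number of separate invasion episodes


def summarize_invasion(flags: Sequence[bool], fps: float) -> InvasionSummary:
    """Summarize per-frame invasion flags into temporal parameters.

    Assumption: an episode is a maximal run of consecutive frames whose
    flag is True; the paper may define episodes differently.
    """
    if fps <= 0:
        raise ValueError("fps must be positive")

    run_lengths: List[int] = []
    run = 0
    for flagged in flags:
        if flagged:
            run += 1                  # extend the current episode
        elif run:
            run_lengths.append(run)   # episode just ended
            run = 0
    if run:
        run_lengths.append(run)       # episode still open at the last frame

    durations = [n / fps for n in run_lengths]
    return InvasionSummary(
        episode_durations_s=durations,
        total_exposure_s=sum(run_lengths) / fps,
        num_episodes=len(run_lengths),
    )


if __name__ == "__main__":
    # Toy sequence at an assumed 10 frames per second: two episodes (2 and 3 frames).
    demo = [False, True, True, False, False, True, True, True, False]
    print(summarize_invasion(demo, fps=10.0))
```

In practice the per-frame flag would come from the bolus segmentation and air column detection stages, and a tracker of this kind might also merge episodes separated by very short gaps to suppress flicker; that choice is left out of this sketch.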