Purpose: The Visual EEG Confusion Assessment Method-Severity (VE-CAM-S) quantifies encephalopathy severity from electroencephalography (EEG) features. This study evaluated inter-rater reliability among experts using the VE-CAM-S scale.
Methods: Nine experts from six institutions independently reviewed 32 15-second electroencephalography samples in an online test, assessing 29 features (16 in the VE-CAM-S and 13 additional, or "VE-CAM-S+"). A consensus of three experts served as the gold standard. Performance was measured by the median Matthews correlation coefficient between expert and gold-standard VE-CAM-S+ scores, along with average sensitivity and specificity. Qualitative analysis identified common feature-recognition errors affecting scores.
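For readers unfamiliar with the agreement metrics named above, the following is a minimal sketch of how sensitivity, specificity, and the Matthews correlation coefficient can be computed for one feature. It assumes binary present/absent ratings from a single expert compared against the gold standard. The function names and the example ratings are hypothetical and are not taken from the study, whose exact computation on VE-CAM-S+ scores is not described here.

```python
import math

def confusion_counts(expert, gold):
    """Count TP, TN, FP, FN for paired binary ratings (1 = feature present)."""
    tp = sum(1 for e, g in zip(expert, gold) if e and g)
    tn = sum(1 for e, g in zip(expert, gold) if not e and not g)
    fp = sum(1 for e, g in zip(expert, gold) if e and not g)
    fn = sum(1 for e, g in zip(expert, gold) if not e and g)
    return tp, tn, fp, fn

def sensitivity(tp, fn):
    """Fraction of gold-standard positives the expert identified."""
    return tp / (tp + fn) if (tp + fn) else float("nan")

def specificity(tn, fp):
    """Fraction of gold-standard negatives the expert correctly left unmarked."""
    return tn / (tn + fp) if (tn + fp) else float("nan")

def matthews_corrcoef(tp, tn, fp, fn):
    """Binary MCC; returns 0.0 when any marginal total is zero (a common convention)."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Hypothetical ratings for one feature across eight EEG samples (illustration only).
gold_ratings = [1, 1, 1, 0, 0, 0, 0, 0]
expert_ratings = [1, 1, 0, 0, 1, 0, 0, 0]

tp, tn, fp, fn = confusion_counts(expert_ratings, gold_ratings)
sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
mcc = matthews_corrcoef(tp, tn, fp, fn)
```

In this illustration the expert misses one true finding and reports one false finding, so the MCC falls well below 1 even though most samples are rated correctly. The MCC uses all four cells of the confusion table, which makes it less sensitive to class imbalance than raw percent agreement.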
Results: Experts achieved a median Matthews correlation coefficient of 0.82 [95% CI: 0.74–0.99]. Specificity exceeded 90% for most features except background β (87%) and generalized delta (71%). Sensitivity was ≥65% except for burst suppression with epileptiform activity (61%), extreme delta brush (EDB; 61%), posterior dominant rhythm (50%), and background α (59%) and β (42%). Common errors included missing subtle findings, confusing one feature for another, and misidentifying EDB.
Conclusions: This pilot study offers some initial support for the reliability of VE-CAM-S+ scoring. The largest errors occurred when experts missed or falsely identified features with higher weight in the VE-CAM-S. Encephalopathy grading through VE-CAM-S may be improved by breaking high-stakes features into smaller parts, creating a "cheat sheet" with scored examples, and designing teaching materials.
Keywords: Critical care; Electroencephalography (EEG); Encephalopathy; Pilot study; Reliability (inter-rater reliability, IRR); Teaching.
Copyright © 2025 by the American Clinical Neurophysiology Society.