Observer variability in assessing lumbar spinal stenosis severity on magnetic resonance imaging and its relation to cross-sectional spinal canal area

Alex C Speciale; Ricardo Pietrobon; Chris W Urban; William J Richardson; Clyde A Helms; Nancy Major; David Enterline; Lloyd Hey; Michael Haglund; Dennis A Turner

doi:10.1097/00007632-200205150-00014

Spine (Phila Pa 1976). 2002 May 15;27(10):1082-6. doi: 10.1097/00007632-200205150-00014.

Authors

Alex C Speciale¹, Ricardo Pietrobon, Chris W Urban, William J Richardson, Clyde A Helms, Nancy Major, David Enterline, Lloyd Hey, Michael Haglund, Dennis A Turner

Affiliation

¹ Department of Radiology, Duke University Medical Center, Durham, North Carolina, USA.

PMID: 12004176
DOI: 10.1097/00007632-200205150-00014

Abstract

Study design: Magnetic resonance image grading of lumbar spinal stenosis severity was analyzed retrospectively using a common clinical format.

Objective: To assess the interobserver and intraobserver reliability of magnetic resonance imaging grades assigned to patients with lumbar spinal stenosis, and to compare those grades with cross-sectional spinal canal area.

Summary of background data: Physicians currently classify the degree of lumbar spinal stenosis on magnetic resonance imaging as mild, moderate, or severe. Unfortunately, there is no consensus on criteria for these definitions.

Methods: The magnetic resonance imaging scans of 15 patients with lumbar stenosis were rated blindly by seven observers for the degree of central, lateral recess, and foraminal stenosis at each level from L1-L2 to L5-S1. Weighted kappa statistics were calculated to analyze inter- and intraobserver agreement. Digitized spinal canal area measurements were calculated. Linear regression models were used to assess the reliability of the grading system in predicting the cross-sectional area.
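The agreement analysis described in the Methods can be sketched as follows. This is a minimal illustration of Cohen's weighted kappa for two raters, plus an average over all observer pairs. The abstract does not state the weighting scheme, the software used, or how pairwise values were averaged, so the linear weights, the integer coding of grades (0 = least severe), and the function names `weighted_kappa` and `mean_pairwise_kappa` are all assumptions made for this example.

```python
from itertools import combinations


def weighted_kappa(ratings_a, ratings_b, n_categories, weights="linear"):
    """Cohen's weighted kappa for two raters.

    Ratings are ordinal codes 0..n_categories-1 (assumed coding, not from
    the abstract). `weights` is "linear" or "quadratic" (assumed options).
    """
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("both raters must rate the same non-empty set of items")
    if n_categories < 2:
        raise ValueError("at least two categories are required")

    n = len(ratings_a)
    k = n_categories

    # Observed joint proportions: observed[i][j] is the share of items
    # graded i by rater A and j by rater B.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        observed[a][b] += 1.0 / n

    # Marginal proportions for each rater.
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    power = 1 if weights == "linear" else 2
    observed_disagreement = 0.0
    expected_disagreement = 0.0
    for i in range(k):
        for j in range(k):
            # Disagreement weight grows with the distance between grades.
            w = (abs(i - j) / (k - 1)) ** power
            observed_disagreement += w * observed[i][j]
            expected_disagreement += w * row[i] * col[j]

    if expected_disagreement == 0.0:
        # Both raters used one identical category; kappa is undefined.
        return float("nan")
    return 1.0 - observed_disagreement / expected_disagreement


def mean_pairwise_kappa(ratings_by_observer, n_categories, weights="linear"):
    """Average weighted kappa over every pair of observers (assumed summary)."""
    pairs = list(combinations(ratings_by_observer, 2))
    if not pairs:
        raise ValueError("at least two observers are required")
    total = sum(weighted_kappa(a, b, n_categories, weights) for a, b in pairs)
    return total / len(pairs)
```

With seven observers, `mean_pairwise_kappa` would average 21 pairwise values. Intraobserver agreement can be computed with the same `weighted_kappa` function by passing one observer's first and second readings as the two rating lists.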

Results: The average interobserver kappa score was 0.26. Within different specialties, the interobserver reliability was highest among radiologists (0.40), followed by neurosurgeons (0.21) and orthopedic surgeons (0.15). The average intraobserver kappa score was 0.11, rising to 0.43 after categories were combined (P = 0.001). The classification of central stenosis highly predicted spinal canal area (P < 0.001).

Conclusions: The findings indicate only a fair level of agreement among all observers. However, the ability of the various readers to predict the degree of central stenosis was high. Further studies should evaluate a consensus-based, standardized magnetic resonance image classification aimed at improved agreement among observers.
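The label "fair" can be read against a conventional kappa benchmark. The sketch below maps a kappa value to the descriptive bands proposed by Landis and Koch (1977). The abstract does not say which benchmark the authors applied, so the use of these bands and the function name `landis_koch_label` are assumptions for illustration only.

```python
def landis_koch_label(kappa):
    """Return the Landis and Koch (1977) descriptive band for a kappa value.

    Assumed benchmark; the abstract does not name the scale it used.
    """
    if kappa < 0.0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"
```

Under these bands, the average interobserver score of 0.26 falls in the "fair" range, which is consistent with the wording of the Conclusions.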

MeSH terms

Humans
Linear Models
Magnetic Resonance Imaging / methods*
Magnetic Resonance Imaging / statistics & numerical data
Observer Variation*
Severity of Illness Index
Spinal Canal / pathology
Spinal Stenosis / pathology*