Background: Stress fractures represent a fatigue failure of bone. They occur across a spectrum of structural severity, and their healing potential varies by location. No comprehensive classification system for stress fractures incorporates both the clinical and the radiographic characteristics of the injury and is applicable to all bones. We introduce a system that is reproducible, generalizable, easy to use, and clinically relevant, with three descriptors: fracture grade, fracture location, and imaging modality.
Methods: After a review of current classification systems, a five-tier system was proposed to determine fracture grade: Grade I indicated asymptomatic stress reaction on imaging, Grade II indicated pain with no fracture line, Grade III indicated non-displaced fracture, Grade IV indicated displaced fracture, and Grade V indicated nonunion. Example cases of each grade, with clinical vignettes and images, were prepared to assess the interobserver and intraobserver reliability of the system through test-retest evaluation by fifteen clinicians. A questionnaire and recall test assessed ease of use, clinical applicability, and recall accuracy.
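As an illustration only, the three descriptors and the five grades could be recorded as a small data structure. The sketch below is not part of the study: the class names, the free-text `location` and `imaging_modality` fields, and the output format of `describe` are all hypothetical choices made for this example.

```python
from dataclasses import dataclass
from enum import IntEnum


class Grade(IntEnum):
    """Five-tier fracture grade; ordering follows the Methods."""
    I = 1
    II = 2
    III = 3
    IV = 4
    V = 5


# Grade definitions exactly as stated in the Methods.
GRADE_MEANING = {
    Grade.I: "asymptomatic stress reaction on imaging",
    Grade.II: "pain with no fracture line",
    Grade.III: "non-displaced fracture",
    Grade.IV: "displaced fracture",
    Grade.V: "nonunion",
}


@dataclass(frozen=True)
class StressFracture:
    """One classified injury: grade, location, imaging modality."""
    grade: Grade
    location: str          # hypothetical: free text, e.g. "tibia"
    imaging_modality: str  # hypothetical: free text, e.g. "MRI"

    def describe(self) -> str:
        # Hypothetical reporting format, chosen for this sketch only.
        meaning = GRADE_MEANING[self.grade]
        return f"Grade {self.grade.name} ({meaning}), {self.location}, {self.imaging_modality}"


example = StressFracture(Grade.III, "tibia", "MRI")
print(example.describe())
```

Using an ordered enumeration keeps the grades comparable (for example, `Grade.IV > Grade.II`), which mirrors the increasing structural severity from Grade I to Grade V.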
Results: Test-retest analysis showed almost perfect intraobserver agreement, with an overall kappa value of 0.81. The overall interobserver reliability showed substantial agreement, with a kappa value of 0.78. When the evaluators' responses were compared with the answer key (our assessment), agreement was almost perfect, with a kappa value of 0.83. The recall test showed an overall accuracy of 97.3%. Of the fifteen evaluators who completed questionnaires, fourteen (93.3%) said that the system would be easily remembered, would facilitate communication among colleagues, and would be useful in clinical practice.
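For readers unfamiliar with the kappa statistic, the following is a minimal sketch of how chance-corrected agreement between two sets of ratings can be computed and labeled. It assumes the two-rater Cohen's kappa; the abstract does not state which kappa variant was used. The labels follow the widely used Landis and Koch bands, which match the terms in the Results ("substantial" for 0.61 to 0.80, "almost perfect" for 0.81 to 1.00). The ratings in the usage lines are invented for illustration and are not study data.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two equal-length sequences of categorical ratings."""
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed proportion of cases on which the two ratings agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category
    return (observed - expected) / (1.0 - expected)


def landis_koch_label(kappa):
    """Map a kappa value to the Landis and Koch agreement label."""
    if kappa > 0.80:
        return "almost perfect"
    if kappa > 0.60:
        return "substantial"
    if kappa > 0.40:
        return "moderate"
    if kappa > 0.20:
        return "fair"
    if kappa >= 0.0:
        return "slight"
    return "poor"


# Invented grades (1-5) from a first reading and a retest of ten cases.
first_reading = [1, 2, 3, 4, 5, 1, 2, 3, 4, 5]
retest = [1, 2, 3, 4, 5, 1, 2, 3, 4, 4]
kappa = cohen_kappa(first_reading, retest)
print(round(kappa, 3), landis_koch_label(kappa))
```

Applied to the values reported above, `landis_koch_label(0.78)` gives "substantial" and `landis_koch_label(0.81)` gives "almost perfect".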
Conclusions: The proposed stress fracture classification system is clinically relevant, easily applied, and generalizable, and it showed substantial interobserver and almost perfect intraobserver reliability.