Interobserver reliability of Diméglio and Pirani score and their subcomponents in the evaluation of idiopathic clubfoot in a clinical setting: a need for improved scoring systems

J Child Orthop. 2019 Oct 1;13(5):478-485. doi: 10.1302/1863-2548.13.190010.


Purpose: Diméglio (DimS) and Pirani (PirS) scores are the most commonly used scoring systems for evaluation of clubfoot, with many centres using both. Interobserver reliability of their global scores has been rated high in a few studies, but agreement on their subcomponents has been poorly investigated. The aim of the study was to assess interrater reliability of global scores and of individual items in a clinical setting and to analyse overlapping features of the two scores.

Methods: Fifty-six consecutive idiopathic clubfeet undergoing correction using the Ponseti method were independently evaluated at each casting session by two trained paediatric orthopaedic surgeons using both scores. Interobserver reliability of collected data was analysed; a kappa coefficient > 0.60 was considered adequate.
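The adequacy threshold above can be illustrated with a minimal sketch of an unweighted Cohen's kappa for two raters. The abstract does not state which kappa variant (unweighted or weighted) was used, so the unweighted form is an assumption here; the function name `cohen_kappa` and the item scores are hypothetical and are not study data.

```python
from collections import Counter

def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters scoring the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(rater_a)
    # Observed agreement: share of items given the identical score.
    p_obs = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal score frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_exp = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_exp == 1.0:
        return 1.0  # both raters used a single identical category throughout
    return (p_obs - p_exp) / (1 - p_exp)

# Hypothetical item scores (0, 0.5, 1) for ten feet; not taken from the study.
surgeon_1 = [0, 0.5, 1, 1, 0.5, 0, 1, 0.5, 0, 1]
surgeon_2 = [0, 0.5, 1, 0.5, 0.5, 0, 1, 1, 0, 1]

kappa = cohen_kappa(surgeon_1, surgeon_2)
# Apply the adequacy threshold stated in the Methods (kappa > 0.60).
print(f"kappa = {kappa:.2f}, adequate: {kappa > 0.60}")
```

Kappa corrects the raw proportion of identical scores for the agreement expected by chance, which is why it can be much lower than the raw agreement when raters use few categories.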

Results: For DimS and PirS, the Pearson correlation coefficients were 0.87 and 0.91 (p < 0.0001) respectively, and kappa coefficients were 0.23 and 0.31. Among subcomponents, kappa values exceeded 0.60 only for equinus and curvature of lateral border in PirS; muscular abnormality in DimS was rated 0.74, but a high prevalence index (0.94) indicated that this value was influenced by the low prevalence of this feature. All other items showed kappa < 0.60 and were considered in need of improvement. For overlapping features, posterior and medial crease showed similar agreement in the two systems, whereas items describing equinus and midfoot adduction were much more reliable in PirS than in DimS.
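A high correlation alongside low agreement can seem contradictory, so the following sketch, built on invented numbers, shows how the two can diverge: a rater who scores consistently one point higher correlates perfectly with the other rater yet never gives the same score. It also computes a prevalence index using one common definition, |a - d| / N for a 2x2 agreement table; the abstract does not give the formula used, so that definition is an assumption. All function names and values are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def exact_agreement(x, y):
    """Proportion of items on which both raters gave the identical score."""
    return sum(a == b for a, b in zip(x, y)) / len(x)

def prevalence_index(both_present, both_absent, n_total):
    """Assumed definition: |a - d| / N, where a and d are the two
    agreement cells of a 2x2 table (feature present / feature absent)."""
    return abs(both_present - both_absent) / n_total

# Hypothetical total scores: rater 2 is always exactly one point higher.
rater_1 = [4, 6, 8, 10, 12, 14]
rater_2 = [5, 7, 9, 11, 13, 15]
r = pearson_r(rater_1, rater_2)
agree = exact_agreement(rater_1, rater_2)
print(f"Pearson r = {r:.2f}, exact agreement = {agree:.2f}")

# Hypothetical rare feature: both raters mark it present in 2 of 100
# observations and absent in 95; the remaining 3 are disagreements.
pi = prevalence_index(both_present=2, both_absent=95, n_total=100)
print(f"prevalence index = {pi:.2f}")
```

Correlation measures whether two raters rank feet similarly, while kappa measures whether they assign the same category, so a systematic offset between raters leaves the former intact and erodes the latter. A prevalence index near 1 signals that almost all observations fall in one agreement cell, a situation in which kappa is known to behave unstably.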

Conclusions: In a clinical setting, despite a high correlation of evaluations for total scores, the interobserver agreement of DimS and PirS was not adequate and only a few items were substantially reliable. Simultaneous use of the two scores seemed redundant, and some overlapping features showed different reliability depending on the criterion or scale used. Future scoring systems should address these limitations.

Level of evidence: Level I - Diagnostic studies.

Keywords: Diméglio score; Pirani score; Ponseti method; clubfoot; reliability.