Aim: To determine the inter-observer reproducibility of 15 tests used for predicting difficult tracheal intubation (DI).
Material and methods: Following local ethics committee approval and informed consent, 101 volunteers were examined by two assessors using 15 tests for predicting DI. The two assessors, each blinded to the other's results, examined each volunteer independently. Cohen's kappa (κ) and the first-order agreement coefficient (AC1) were used to measure agreement between assessor ratings on a qualitative scale. Agreement between quantitative measurements was described using the intraclass correlation coefficient (ICC) and Pearson's (PCC) or Spearman's (SCC) correlation coefficient. The coefficients were interpreted as follows: poor (< 0.20), fair (0.21-0.40), satisfactory (0.41-0.60), good (0.61-0.80), and excellent (0.81-1.00).
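As an illustration of the two qualitative agreement measures named above, the following is a minimal sketch of Cohen's κ and Gwet's AC1 for two raters, together with the interpretation scale. The function names, the handling of band boundaries, and the example ratings are assumptions made for illustration; the ratings are hypothetical and are not data from this study.

```python
from collections import Counter


def agreement_coefficients(ratings_a, ratings_b, categories=None):
    """Return (kappa, ac1) for two raters scoring the same subjects.

    `categories` lists all possible categories; if omitted, the observed
    categories are used (an assumption of this sketch).
    """
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("both raters must score the same non-empty set of subjects")
    if categories is None:
        categories = list(set(ratings_a) | set(ratings_b))
    q = len(categories)
    if q < 2:
        raise ValueError("at least two categories are required")

    n = len(ratings_a)
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)

    # Observed proportion of subjects on which the raters agree.
    p_o = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Chance agreement for Cohen's kappa: product of the raters' marginals.
    pe_kappa = sum((count_a[c] / n) * (count_b[c] / n) for c in categories)

    # Chance agreement for Gwet's AC1: based on the mean marginal per category.
    mean_marginals = [(count_a[c] + count_b[c]) / (2 * n) for c in categories]
    pe_ac1 = sum(p * (1 - p) for p in mean_marginals) / (q - 1)

    kappa = (p_o - pe_kappa) / (1 - pe_kappa)
    ac1 = (p_o - pe_ac1) / (1 - pe_ac1)
    return kappa, ac1


def interpret(coefficient):
    """Map a coefficient to the scale used in this abstract.

    The abstract leaves the boundaries (e.g. exactly 0.20) unassigned;
    treating each band as 'up to and including' its upper limit is an
    assumption of this sketch.
    """
    if coefficient <= 0.20:
        return "poor"
    if coefficient <= 0.40:
        return "fair"
    if coefficient <= 0.60:
        return "satisfactory"
    if coefficient <= 0.80:
        return "good"
    return "excellent"


# Hypothetical ratings for 101 subjects with a very rare positive finding:
# the raters agree on 97 negatives and disagree on 4 subjects.
rater_a = ["neg"] * 97 + ["pos", "pos", "neg", "neg"]
rater_b = ["neg"] * 97 + ["neg", "neg", "pos", "pos"]

kappa, ac1 = agreement_coefficients(rater_a, rater_b)
print(f"kappa = {kappa:.3f} ({interpret(kappa)})")
print(f"AC1   = {ac1:.3f} ({interpret(ac1)})")
```

With such skewed prevalence, κ falls near or below zero even though the raters agree on almost every subject, whereas AC1 remains high. This is one commonly cited reason for reporting both coefficients side by side.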
Results: The coefficients of inter-rater agreement and the correlation coefficients for the individual tests were as follows: pathologies associated with DI (κ=0.662, AC1=0.990), clinical impression (κ=-0.013, AC1=0.969), modified Mallampati test (κ=0.503, AC1=0.861), upper lip bite test (κ=0.370, AC1=0.897), temporo-mandibular joint movement (κ=0.088, AC1=0.797), maximal anteroflexion of the C-spine (ICC=0.136, SCC=0.391), maximal retroflexion of the C-spine (ICC=0.020, SCC=0.284), mandibular length (ICC=0.301, SCC=0.553), neck circumference (ICC=0.832, SCC=0.928), hyo-mental distance (ICC=0.378, SCC=0.472), thyro-mental distance (ICC=-0.002, PCC=0.265), sternomental distance (ICC=0.674, PCC=0.815), and inter-incisor gap (ICC=0.695, PCC=0.785). Two tests (positive history of DI and retrogenia) were excluded from the calculation because no positive cases were found.
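For the quantitative tests, the ICC is consistently lower than the accompanying correlation coefficient. One general reason the two can differ is that correlation ignores a systematic offset between raters, while an absolute-agreement ICC penalises it. The sketch below illustrates this with hypothetical measurements. The abstract does not state which ICC form was used, so the two-way, absolute-agreement, single-measure form (often written ICC(2,1)) is an assumption, as are the function names and the example values.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / sqrt(s_xx * s_yy)


def icc_absolute_agreement(x, y):
    """ICC(2,1) for two raters (assumed form; not confirmed by the abstract)."""
    n, k = len(x), 2
    values = list(x) + list(y)
    grand_mean = sum(values) / (n * k)
    subject_means = [(a + b) / 2 for a, b in zip(x, y)]
    rater_means = [sum(x) / n, sum(y) / n]

    # Two-way ANOVA sums of squares: subjects, raters, and residual error.
    ss_subjects = k * sum((m - grand_mean) ** 2 for m in subject_means)
    ss_raters = n * sum((m - grand_mean) ** 2 for m in rater_means)
    ss_total = sum((v - grand_mean) ** 2 for v in values)
    ss_error = ss_total - ss_subjects - ss_raters

    ms_subjects = ss_subjects / (n - 1)
    ms_raters = ss_raters / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_subjects - ms_error) / (
        ms_subjects + (k - 1) * ms_error + k * (ms_raters - ms_error) / n
    )


# Hypothetical distances in cm: the second rater reads 2 cm higher throughout.
rater_a = [6.0, 6.5, 7.0, 7.5, 8.0]
rater_b = [value + 2.0 for value in rater_a]

print(f"Pearson = {pearson(rater_a, rater_b):.3f}")
print(f"ICC     = {icc_absolute_agreement(rater_a, rater_b):.3f}")
```

In this constructed case the ranking of subjects is identical for both raters, so the correlation is perfect, yet the constant offset pulls the ICC down sharply. Whether a systematic offset explains the differences reported here cannot be determined from the abstract alone.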
Conclusion: The best inter-rater agreement was found for neck circumference, while the largest discrepancies between raters were in goniometrically measured mobility of the C-spine. Many of the pre-operative airway tests had only fair inter-observer reproducibility. This may be one reason why models for predicting difficult intubation are not universally reliable.