Inter-rater Reliability and Validity of Holgers Scores for the Assessment of Bone-anchored Hearing Implant Images

Otol Neurotol. 2019 Feb;40(2):200-203. doi: 10.1097/MAO.0000000000002100.

Abstract

Objectives: This study aims to review the utility and interassessor reliability of Holgers classification by simultaneously testing various professionals of the bone-anchored implant team for their impression of a series of randomized images.

Study design: Retrospective review of a randomized series of bone-anchored implant fixture clinical photographs from the database at a tertiary referral university hospital. Raters were blinded to the contemporaneous Holgers grading assigned by the Clinical Nurse Specialist at initial assessment. Multivariate analysis was performed for correlation between scores for assessors and between grades of assessor.

Setting: Queen Elizabeth Hospital, Birmingham, UK, a tertiary center for bone-anchored hearing implants (BAHIs).

Patients: Patients implanted from May 2012 until November 2014.

Main outcome measure: Photographs of fixture sites of adult patients were taken at 1 week, 6 months, and 12 months after bone-anchored hearing implant surgery performed with either a tissue reduction approach (split skin graft or linear incision technique) or a tissue preservation approach. On a single occasion, 263 images were reviewed by 10 assessors (2 consultants, 2 higher surgical trainees, 3 junior doctors, and 3 audiologists). Images were displayed at 10-second intervals and were scored by each assessor. Assessors were blinded to patient identity, time points, and each other's scores. Results were then compared against the real-time Holgers grades assigned by the BAHI specialist nurse.
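The rating setup described above (every image scored independently by several assessors) lends itself to a simple agreement calculation. The following is a minimal sketch only: the abstract does not state how the correlation between assessors was computed, so averaging Pearson's r over all pairs of raters is an assumption, and the rater names and scores are hypothetical values, not study data.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def mean_pairwise_r(scores_by_rater):
    """Average Pearson r over every pair of raters (assumed summary measure)."""
    pairs = list(combinations(scores_by_rater.values(), 2))
    return sum(pearson_r(a, b) for a, b in pairs) / len(pairs)


# Hypothetical Holgers grades (0-4) for six images; NOT data from the study.
example_scores = {
    "rater_a": [0, 1, 2, 0, 3, 1],
    "rater_b": [0, 1, 1, 0, 3, 2],
    "rater_c": [1, 1, 2, 0, 4, 1],
}

overall_r = mean_pairwise_r(example_scores)
print(f"mean pairwise r = {overall_r:.3f}")
```

Restricting the dictionary to the raters of one professional group would give a within-group figure in the same way. Because Holgers grades are ordinal, a rank-based or chance-corrected statistic may be preferable in practice; Pearson's r is used here only because it is the statistic the abstract names.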

Results: Overall, 227 images (86.2%) showed sites operated on with a tissue reduction technique, of which 110 (41.8%) used a linear incision and 117 (44.4%) used a split skin graft (SSG); 36 (13.6%) showed sites operated on with a tissue preservation technique. Of these 263 images, 104 (39.5%) were taken at 1 week, 70 (26.6%) at 6 months, and 89 (33.9%) at 12 months. With assessors blinded to the time points, the cumulative number of scores for each grade was: grade 0 = 1132 (43.04%), grade 1 = 995 (37.83%), grade 2 = 346 (13.15%), grade 3 = 141 (5.36%), grade 4 = 16 (0.6%). The 2630 data points had a variance of only 0.6415 for each nominal. Multivariate correlation between all assessors was r = 0.7230 (Pearson's r). Correlations were r = 0.6317 between consultants, r = 0.7351 between higher surgical trainees, r = 0.7599 between junior doctors, and r = 0.7981 between audiologists. There was a good correlation (r = 0.89) with no statistically significant differences between the SSG and linear incision groups (p > 0.05), possibly suggesting that the Holgers score is comparable within both these tissue reduction techniques. There was a moderate correlation (r = 0.58) with statistically significant differences between the tissue preservation and tissue reduction groups (p < 0.05), possibly suggesting that tissue preservation gives better results, with lower Holgers scores, than tissue reduction.
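The share of ratings falling into each Holgers grade follows directly from the counts reported above. The short sketch below reproduces that arithmetic from the published counts; the variable names are illustrative, and the 10 × 263 total is inferred from the stated numbers of assessors and images.

```python
# Grade counts as reported in the abstract (10 assessors x 263 images).
grade_counts = {0: 1132, 1: 995, 2: 346, 3: 141, 4: 16}

# Total number of individual ratings across all assessors and images.
total_ratings = sum(grade_counts.values())

# Percentage of all ratings assigned to each grade.
grade_share = {
    grade: 100 * count / total_ratings for grade, count in grade_counts.items()
}

for grade, share in grade_share.items():
    print(f"grade {grade}: {grade_counts[grade]} ratings ({share:.2f}%)")
print(f"total ratings: {total_ratings}")
```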

Conclusion: The Holgers scoring system is a reliable tool with respect to inter-rater variability across all levels of experience. Correlation was closer among audiologists and less experienced assessors.

MeSH terms

  • Adult
  • Bone-Anchored Prosthesis / adverse effects*
  • Female
  • Hearing Aids / adverse effects*
  • Humans
  • Male
  • Middle Aged
  • Postoperative Complications / diagnosis*
  • Postoperative Complications / pathology*
  • Reproducibility of Results
  • Retrospective Studies
  • Severity of Illness Index*