A score card for upper GI endoscopy: Evaluation of interobserver variability in examiners with various levels of experience

Z Gastroenterol. 2002 Oct;40(10):857-62. doi: 10.1055/s-2002-35258.

Abstract

Background: In most European countries, training in GI endoscopy has largely been based on hands-on acquisition of experience in patients rather than on a structured training programme. With the development of training models systematic hands-on training in a variety of diagnostic and therapeutic endoscopy techniques was achieved. Little, however, is known about methods of objectively assessing trainees' performance. We therefore developed an assessment 'score card' for upper GI endoscopy and tested it in endoscopists with various levels of experience. The aim of the study was therefore to assess interobserver variations in the evaluation of trainees.

Methods: On the basis of textbook and expert opinions a consensus group of eight experienced endoscopists developed a score card for diagnostic upper GI endoscopy with biopsy. The score card includes an assessment of the single steps of the procedure as well as of the times needed to complete each step. This score card was then evaluated in a further conference including ten experts who blindly assessed videotapes of 15 endoscopists performing upper GI endoscopy in a training bio-simulation model (the 'Erlangen Endo-Trainer'). On the basis of their previous experience (i. e. the number of endoscopies performed) these 15 endoscopists were classified into four groups: very experienced, experienced, having some experience and inexperienced. Interobserver variability (IOV) was tested for the various score card parameters (Kendall's rank-correlation coefficient 0.0-0.5 poor, 0.5-1.0 good agreement). In addition, the correlation between the score card assessment and the examiners' experience levels was analysed.

Results: Despite poor IOV results for all the parameters tested (Kendall coefficient < 0.3), the assessment parameters correlated well when the examiners' different experience levels were taken into account (correlation coefficient 0.59-0.89, p < 0.05). The score card parameters were suitable for differentiating between the four groups of examiners with different levels of endoscopic experience.

Conclusions: As expected with scores involving subjective assessment of performance, the variability between reviewers was substantial. Nevertheless, the assessment score was capable of distinguishing reliably between different experience levels in terms of a good individual observer consistency. The score card can therefore be used to document both training status and progress during endoscopy training courses using bio-simulation models, and this might be able to provide improved quality assurance in GI endoscopy training.

Publication types

  • Comparative Study
  • Evaluation Study

MeSH terms

  • Animals
  • Clinical Competence / statistics & numerical data*
  • Curriculum
  • Educational Measurement / statistics & numerical data
  • Endoscopy, Gastrointestinal / statistics & numerical data*
  • Gastroenterology / education*
  • Germany
  • Humans
  • Models, Anatomic
  • Observer Variation
  • Swine
  • Video Recording / instrumentation