Inference procedures for assessing interobserver agreement among multiple raters

Biometrics. 2001 Jun;57(2):584-8. doi: 10.1111/j.0006-341x.2001.00584.x.

Abstract

We propose a new procedure for constructing inferences about a measure of interobserver agreement in studies involving a binary outcome and multiple raters. The proposed procedure, based on a chi-square goodness-of-fit test as applied to the correlated binomial model (Bahadur, 1961, in Studies in Item Analysis and Prediction, 158-176), is an extension of the goodness-of-fit procedure developed by Donner and Eliasziw (1992, Statistics in Medicine 11, 1511-1519) for the case of two raters. The new procedure is shown to provide confidence-interval coverage levels that are close to nominal over a wide range of parameter combinations. The procedure also provides a sample-size formula that may be used to determine the required number of subjects and raters for such studies.
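As an illustration of the kind of model the abstract refers to, the sketch below evaluates a correlated binomial distribution for the number of positive ratings a subject receives from n raters. It assumes the second-order form of Bahadur's model, in which all correlations above the pairwise one are set to zero; the abstract does not say which truncation the authors use, so this is an assumption rather than the paper's own formulation. The names correlated_binomial_pmf, pmf_table, p (probability of a positive rating) and rho (pairwise correlation between raters, serving as the agreement measure) are hypothetical.

```python
from math import comb


def correlated_binomial_pmf(x, n, p, rho):
    """P(X = x) for x positive ratings out of n raters.

    Assumed form: second-order truncation of Bahadur's model, i.e. the
    ordinary binomial probability multiplied by a correction term that
    depends on the pairwise correlation rho. Requires 0 < p < 1.
    """
    q = 1.0 - p
    binomial_part = comb(n, x) * p**x * q ** (n - x)
    correction = 1.0 + rho / (2.0 * p * q) * (
        (x - n * p) ** 2 + x * (2.0 * p - 1.0) - n * p**2
    )
    return binomial_part * correction


def pmf_table(n, p, rho):
    """Probabilities for x = 0, 1, ..., n under the assumed model."""
    return [correlated_binomial_pmf(x, n, p, rho) for x in range(n + 1)]


if __name__ == "__main__":
    # Two raters: this reduces to the familiar common-correlation form
    # P(0) = q^2 + rho*p*q, P(1) = 2*p*q*(1 - rho), P(2) = p^2 + rho*p*q.
    for x, prob in enumerate(pmf_table(2, 0.3, 0.4)):
        print(x, round(prob, 4))
```

With two raters, p = 0.3 and rho = 0.4, the three probabilities come to approximately 0.574, 0.252 and 0.174. A chi-square goodness-of-fit statistic of the kind mentioned in the abstract would compare observed counts in each category with expected counts from such a table. Under this truncated form, some combinations of n, p and rho give negative values, so the parameters have to stay within the range where the model is a valid distribution.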

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Binomial Distribution
  • Biometry / methods
  • Confidence Intervals
  • Eye Enucleation / adverse effects
  • Eye Neoplasms / pathology
  • Eye Neoplasms / surgery
  • Humans
  • Models, Statistical
  • Necrosis
  • Observer Variation*
  • Reproducibility of Results