Measuring agreement between two judges on the presence or absence of a trait

Biometrics. 1975 Sep;31(3):651-9.


At least a dozen indexes have been proposed for measuring agreement between two judges on a categorical scale. Using the binary (positive-negative) case as a model, this paper presents and critically evaluates some of these proposed measures. The importance of correcting for chance-expected agreement is emphasized, and identities with intraclass correlation coefficients are pointed out.
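To make the idea of correcting for chance-expected agreement concrete, the following minimal sketch computes one widely used chance-corrected index, Cohen's kappa, for the binary case. The abstract does not say which indexes the paper evaluates, so the choice of kappa, the function name `chance_corrected_agreement`, and the cell labels `a`, `b`, `c`, `d` are assumptions made here for illustration only.

```python
def chance_corrected_agreement(a, b, c, d):
    """Return (observed, chance_expected, kappa) for a 2x2 agreement table.

    Hypothetical cell labels (not taken from the paper):
      a = both judges positive
      b = judge 1 positive, judge 2 negative
      c = judge 1 negative, judge 2 positive
      d = both judges negative
    """
    n = a + b + c + d
    if n == 0:
        raise ValueError("the table must contain at least one subject")

    # Observed proportion of subjects on which the judges agree.
    p_observed = (a + d) / n

    # Marginal proportion of positive ratings for each judge.
    p1_positive = (a + b) / n
    p2_positive = (a + c) / n

    # Agreement expected if the two judges rated independently
    # with these marginal rates.
    p_chance = (p1_positive * p2_positive
                + (1 - p1_positive) * (1 - p2_positive))

    if p_chance == 1.0:
        # Both judges gave every subject the same single rating,
        # so the index is undefined.
        raise ValueError("kappa is undefined when chance agreement is 1")

    # Agreement beyond chance, scaled by the maximum possible
    # agreement beyond chance.
    kappa = (p_observed - p_chance) / (1 - p_chance)
    return p_observed, p_chance, kappa


if __name__ == "__main__":
    observed, chance, kappa = chance_corrected_agreement(40, 10, 5, 45)
    print(f"observed={observed:.2f} chance={chance:.2f} kappa={kappa:.2f}")
```

With the illustrative counts above (invented for this example, not data from the paper), the judges agree on 85% of subjects, but half of that agreement would be expected by chance alone, so the corrected index is about 0.70. A table in which all four cells are equal gives an observed agreement of 0.50 and a corrected index of 0, which shows why the raw proportion of agreement can overstate how well two judges agree.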

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Analysis of Variance
  • Humans
  • Judgment*
  • Mathematics
  • Models, Psychological
  • Statistics as Topic