Reproducibility of histomorphologic diagnoses with special reference to the kappa statistic

APMIS. 1989 Aug;97(8):689-98. doi: 10.1111/j.1699-0463.1989.tb00464.x.


Systems for classification and grading used in pathology should ideally be biologically meaningful and should at least be reproducible from one pathologist to another. The kappa statistic, a statistical method for evaluating reproducibility (non-chance agreement) among several observers using nominal or ordinal categories, has been developed and refined over the past few decades. A high level of observed agreement among different pathologists can signify either a high level of reproducibility, if agreement by chance is low, or a low level of reproducibility, if agreement by chance is almost as high as the observed agreement. Therefore, the observed agreement says nothing in itself, unless it is low. The kappa value, however, indicates how much better the observers perform than a throw of the dice, and therefore gives due credit to the agreement that was found. We have developed a user-friendly computer program for calculating inter- and intra-observer agreement of 2 or more observers. By calculating associations between different categories and different observers, the statistic furthermore serves a function close to that of a measure of accuracy. We recommend the use of the above method before a set of nominal or rank scale parameters is used for deciding prognosis and treatment of patients. The computer program is available at no cost to those who submit a diskette.
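
The point that observed agreement can be high while reproducibility is low may be easier to see with numbers. The following is a minimal sketch, not the authors' program: it assumes the simplest case of two observers and the standard Cohen form of the statistic, kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance from each observer's category frequencies. The function name `cohen_kappa` and the ratings are hypothetical and made up purely for illustration.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Return (p_o, p_e, kappa) for two observers rating the same cases.

    Assumes nominal categories and exactly two observers (Cohen's kappa).
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both observers must rate the same non-empty set of cases")
    n = len(ratings_a)

    # Observed agreement: fraction of cases given the same category by both.
    p_o = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Chance agreement: sum over categories of the product of the two
    # observers' marginal proportions for that category.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    p_e = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)

    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return p_o, p_e, (p_o - p_e) / (1 - p_e)


# Hypothetical data: 20 cases, almost all called "benign" by both observers.
skewed_a = ["benign"] * 18 + ["malignant", "benign"]
skewed_b = ["benign"] * 18 + ["benign", "malignant"]

# Hypothetical data: 20 cases, same number of disagreements, balanced categories.
balanced_a = ["benign"] * 10 + ["malignant"] * 8 + ["malignant", "benign"]
balanced_b = ["benign"] * 10 + ["malignant"] * 8 + ["benign", "malignant"]

skewed = cohen_kappa(skewed_a, skewed_b)        # p_o = 0.90, kappa is about -0.05
balanced = cohen_kappa(balanced_a, balanced_b)  # p_o = 0.90, kappa is about 0.80

print("skewed:   p_o=%.3f p_e=%.3f kappa=%.3f" % skewed)
print("balanced: p_o=%.3f p_e=%.3f kappa=%.3f" % balanced)
```

Both made-up data sets show 90% observed agreement, yet in the skewed set chance alone would produce about the same agreement, so kappa is close to zero. More than two observers would need a multi-rater generalization, which this sketch does not cover.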

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Histological Techniques / standards*
  • Pathology / standards*
  • Reproducibility of Results*
  • Statistics as Topic