Kappa statistic to measure agreement beyond chance in free-response assessments

BMC Med Res Methodol. 2017 Apr 19;17(1):62. doi: 10.1186/s12874-017-0340-6.


Background: The usual kappa statistic requires that all observations be enumerated. In free-response assessments, however, only positive (or abnormal) findings are reported, whereas negative (or normal) findings are not. This situation occurs frequently in imaging and other diagnostic studies. We propose a kappa statistic that is suitable for free-response assessments.

Method: We derived the equivalent of Cohen's kappa statistic for two raters under the assumption that the number of possible findings for any given patient is very large. We also derived a formula for the sampling variance that applies to independent observations; for clustered observations, a bootstrap procedure is proposed. The proposed statistic was applied to a real-life dataset and compared with the common practice of collapsing observations within a finite number of regions of interest.
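
One way to see how a "very large number of possible findings" leads to a closed form is to take the limit of the ordinary two-rater Cohen's kappa as the count of concordant negative observations grows without bound. The sketch below assumes a standard 2 x 2 agreement table in which a denotes the (unobserved) concordant negatives; the symbol a is introduced here for illustration only, while b, c and d are the discordant and concordant positive counts named in the Results. It is not a reproduction of the paper's own derivation.

```latex
% Cohen's kappa for a 2 x 2 table with cells a (both negative),
% b and c (discordant), d (both positive):
\kappa = \frac{2(ad - bc)}{(a+b)(b+d) + (a+c)(c+d)}

% Divide numerator and denominator by a, then let a grow without bound:
\lim_{a \to \infty} \kappa
  = \lim_{a \to \infty}
    \frac{2\left(d - \frac{bc}{a}\right)}
         {\left(1 + \frac{b}{a}\right)(b+d) + \left(1 + \frac{c}{a}\right)(c+d)}
  = \frac{2d}{b + c + 2d}
```

The limit no longer depends on a, which is why the statistic can be computed from positive findings alone.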

Results: The free-response kappa is computed from the total numbers of discordant (b and c) and concordant positive (d) observations made in all patients, as 2d/(b + c + 2d). In 84 full-body magnetic resonance imaging procedures in children that were evaluated by 2 independent raters, the free-response kappa statistic was 0.820. Aggregation of results within regions of interest resulted in overestimation of agreement beyond chance.
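
The formula above, together with a bootstrap for clustered observations, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the percentile-interval method, the choice to resample whole patients, and the per-patient counts in the usage example are all hypothetical.

```python
import random


def free_response_kappa(b, c, d):
    """Free-response kappa, 2d / (b + c + 2d).

    b, c: total discordant findings (reported by one rater only).
    d: total concordant positive findings (reported by both raters).
    """
    denom = b + c + 2 * d
    if denom == 0:
        raise ValueError("kappa is undefined when no findings were reported")
    return 2 * d / denom


def cluster_bootstrap_ci(per_patient, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval, resampling patients (clusters).

    per_patient: list of (b, c, d) tuples, one per patient.
    Assumption: a simple percentile interval; the paper's exact
    bootstrap procedure is not specified in the abstract.
    """
    rng = random.Random(seed)
    n = len(per_patient)
    estimates = []
    for _ in range(n_boot):
        # Draw n patients with replacement, keeping each patient's counts together.
        sample = [per_patient[rng.randrange(n)] for _ in range(n)]
        b = sum(p[0] for p in sample)
        c = sum(p[1] for p in sample)
        d = sum(p[2] for p in sample)
        if b + c + 2 * d == 0:
            continue  # skip resamples with no findings at all
        estimates.append(free_response_kappa(b, c, d))
    estimates.sort()
    lower = estimates[int((alpha / 2) * len(estimates))]
    upper = estimates[min(len(estimates) - 1, int((1 - alpha / 2) * len(estimates)))]
    return lower, upper


# Hypothetical per-patient counts (b, c, d), for illustration only.
patients = [(1, 0, 3), (0, 2, 1), (0, 0, 4), (1, 1, 0)]
b_tot = sum(p[0] for p in patients)
c_tot = sum(p[1] for p in patients)
d_tot = sum(p[2] for p in patients)
kappa = free_response_kappa(b_tot, c_tot, d_tot)
ci_low, ci_high = cluster_bootstrap_ci(patients)
print(round(kappa, 3), ci_low, ci_high)
```

Resampling whole patients rather than individual findings keeps findings from the same patient together, which is the usual reason for preferring a cluster-level bootstrap when observations within a patient may be correlated.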

Conclusions: The free-response kappa provides an estimate of agreement beyond chance in situations where only positive findings are reported by raters.

Keywords: Biostatistics; Methodological Study; Reliability (Epidemiology); Reproducibility of results.

MeSH terms

  • Child
  • Datasets as Topic
  • Humans
  • Magnetic Resonance Imaging*
  • Observer Variation
  • Statistics as Topic*