Generalized confidence interval for an agreement between raters

Stat Med. 2021 Apr;40(9):2230-2238. doi: 10.1002/sim.8899. Epub 2021 Feb 12.

Abstract

Estimation and inference are the two key components of any statistical problem; however, inferential procedures for the statistical assessment of agreement among two or more raters are far less developed than the corresponding estimation procedures. The fundamental reason for this gap is the complex expression of the concordance correlation coefficient (CCC) that is frequently used in assessing agreement among raters. Large-sample tests for the CCC often perform poorly with small samples. Hence, inferential procedures suited to small samples are needed to evaluate agreement between raters. We argue that hypothesis testing of the CCC has little value in practice because no gold standard of agreement exists. In this article, we construct a generalized confidence interval (GCI) for the CCC under a bivariate normal model for the measurements, and we also develop a large sample-based confidence interval (LSCI). We establish the satisfactory performance of the GCI by demonstrating the desired coverage probability (CP) via simulation. Results for the GCI and LSCI are illustrated and compared using a data set from a recent study performed at the U.S. Department of Veterans Affairs, Hines.
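The point estimator underlying the intervals discussed above is Lin's sample CCC, rho_c = 2*s_xy / (s_x^2 + s_y^2 + (xbar - ybar)^2), which penalizes both poor correlation and systematic shifts between two raters. A minimal Python sketch of this estimator (not the paper's GCI or LSCI construction, which are not reproduced here) might look like:

```python
# Minimal sketch of Lin's (1989) sample concordance correlation
# coefficient for two raters' paired measurements. This illustrates
# the point estimate only; the GCI/LSCI procedures of the article
# build on it but are not implemented here.
from statistics import fmean

def ccc(x, y):
    """Lin's concordance correlation coefficient for paired ratings."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired sequences of length >= 2")
    n = len(x)
    mx, my = fmean(x), fmean(y)
    # Biased (divide-by-n) moment estimators, as in Lin's definition.
    sx2 = sum((xi - mx) ** 2 for xi in x) / n
    sy2 = sum((yi - my) ** 2 for yi in y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / n
    # Agreement penalized by both imperfect correlation and the
    # squared location shift (mx - my)^2.
    return 2 * sxy / (sx2 + sy2 + (mx - my) ** 2)
```

Identical ratings yield a CCC of 1, perfectly reversed ratings yield -1, and a constant location shift between raters pulls the value below the Pearson correlation.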

Keywords: agreement; concordance correlation coefficient; coverage probability; generalized confidence interval.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computer Simulation
  • Confidence Intervals
  • Humans
  • Models, Statistical*
  • Observer Variation
  • Reproducibility of Results
  • Research Design*