Confidence intervals for the receiver operating characteristic area in studies with small samples

Acad Radiol. 1998 Aug;5(8):561-71. doi: 10.1016/s1076-6332(98)80208-0.


Rationale and objectives: The authors performed this study to address two practical questions. First, how large does the sample size need to be for confidence intervals (CIs) based on the usual asymptotic methods to be appropriate? Second, when the sample size is smaller than this threshold, what alternative method of CI construction should be used?

Materials and methods: The authors performed a Monte Carlo simulation study in which 95% CIs were constructed for the receiver operating characteristic (ROC) area and for the difference between two ROC areas. The simulations covered rating and continuous test results and ROC areas of moderate and high accuracy, and used both parametric and nonparametric estimation methods. Alternative methods evaluated included several bootstrap CIs and CIs based on the Student t distribution.
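As an illustration of one of the alternatives named above, the following is a minimal sketch of a bootstrap percentile CI for a nonparametric ROC area with continuous test results. It is not the authors' implementation. The function names, the Mann-Whitney form of the area estimate, the separate resampling of diseased and nondiseased patients, and the number of bootstrap replicates are all assumptions made for the example.

```python
import random


def auc_mann_whitney(diseased, healthy):
    """Nonparametric ROC area: P(diseased > healthy) + 0.5 * P(tie)."""
    wins = 0.0
    for d in diseased:
        for h in healthy:
            if d > h:
                wins += 1.0
            elif d == h:
                wins += 0.5
    return wins / (len(diseased) * len(healthy))


def bootstrap_percentile_ci(diseased, healthy, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap CI for the ROC area (illustrative sketch).

    Assumption: patients are resampled with replacement within each
    group, so the group sizes stay fixed in every replicate.
    """
    rng = random.Random(seed)
    stats = []
    for _ in range(n_boot):
        d_star = rng.choices(diseased, k=len(diseased))
        h_star = rng.choices(healthy, k=len(healthy))
        stats.append(auc_mann_whitney(d_star, h_star))
    stats.sort()

    def pick(q):
        # Nearest-rank quantile of the sorted bootstrap estimates.
        idx = round(q * (n_boot - 1))
        return stats[min(n_boot - 1, max(0, idx))]

    alpha = 1.0 - level
    return pick(alpha / 2), pick(1.0 - alpha / 2)
```

For example, with two lists of test results, `bootstrap_percentile_ci(diseased, healthy)` returns the lower and upper limits of an approximate 95% CI. The bootstrap t and bias-corrected accelerated variants mentioned in the abstract need additional steps that are not shown here.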

Results: For the difference between two ROC areas, CIs based on asymptotic theory provided adequate coverage even when the sample size was very small (20 patients). In contrast, for a single ROC area, the asymptotic methods did not provide adequate CI coverage for small samples; for ROC areas of high accuracy, the sample size had to be large (more than 200 patients) for the asymptotic methods to be applicable. The recommended alternative (bootstrap percentile, bootstrap t, or bootstrap bias-corrected accelerated method) depends on the estimation approach, the format of the test results, and the ROC area.
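The kind of coverage check described above can be sketched as a small Monte Carlo experiment. This is a hedged illustration, not the study's design. It assumes continuous results drawn from an equal-variance binormal model, a nonparametric (Mann-Whitney) area estimate, and the Hanley-McNeil (1982) standard error as the asymptotic method; the abstract does not say which asymptotic variance estimators were used. All function names and defaults are hypothetical.

```python
import math
import random
from statistics import NormalDist


def auc_mann_whitney(diseased, healthy):
    """Nonparametric ROC area: P(diseased > healthy) + 0.5 * P(tie)."""
    wins = 0.0
    for d in diseased:
        for h in healthy:
            if d > h:
                wins += 1.0
            elif d == h:
                wins += 0.5
    return wins / (len(diseased) * len(healthy))


def hanley_mcneil_se(auc, n_d, n_h):
    """Hanley-McNeil standard error (assumed asymptotic method)."""
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    var = (auc * (1.0 - auc)
           + (n_d - 1) * (q1 - auc ** 2)
           + (n_h - 1) * (q2 - auc ** 2)) / (n_d * n_h)
    return math.sqrt(max(var, 0.0))


def coverage(true_auc, n_d, n_h, n_sim=2000, seed=0):
    """Fraction of simulated 95% Wald CIs that contain the true area."""
    rng = random.Random(seed)
    norm = NormalDist()
    # Equal-variance binormal model: area = Phi(delta / sqrt(2)).
    delta = math.sqrt(2.0) * norm.inv_cdf(true_auc)
    z = norm.inv_cdf(0.975)
    hits = 0
    for _ in range(n_sim):
        healthy = [rng.gauss(0.0, 1.0) for _ in range(n_h)]
        diseased = [rng.gauss(delta, 1.0) for _ in range(n_d)]
        a = auc_mann_whitney(diseased, healthy)
        se = hanley_mcneil_se(a, n_d, n_h)
        if a - z * se <= true_auc <= a + z * se:
            hits += 1
    return hits / n_sim
```

Calling `coverage(0.95, 10, 10)` and `coverage(0.95, 100, 100)` gives a rough sense of how empirical coverage changes with sample size under these assumptions. When the estimated area equals 1, this standard error is 0 and the interval collapses to a point, which is one way a Wald-type interval can fail at high accuracy in small samples.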

Conclusion: Currently, there is no single best alternative for constructing CIs for a single ROC area for small samples.

MeSH terms

  • Confidence Intervals*
  • Monte Carlo Method
  • ROC Curve*
  • Sample Size