Meta-analysis of diagnostic test accuracy studies with multiple thresholds using survival methods

Biom J. 2010 Feb;52(1):95-110. doi: 10.1002/bimj.200900073.


Diagnostic tests play an important role in clinical practice. The objective of a diagnostic test accuracy study is to compare an experimental diagnostic test with a reference standard. The majority of these studies dichotomize test results into two categories: negative and positive. Often, however, the underlying test results can be grouped into more than two ordered categories. This article concerns the situation where multiple studies have evaluated the same diagnostic test with the same multiple thresholds in a population of non-diseased and diseased individuals. Recently, bivariate meta-analysis has been proposed for the pooling of sensitivity and specificity, which are likely to be negatively correlated within studies. These ideas have been extended to the situation of diagnostic tests with multiple thresholds, leading to a multinomial model with multivariate normal between-study variation. This approach is efficient, but it is computationally intensive and its convergence is highly dependent on starting values. Moreover, monotonicity of the sensitivities/specificities for increasing thresholds is not guaranteed. Here, we propose a Poisson-correlated gamma frailty model, previously applied to a seemingly quite different situation: the meta-analysis of paired survival curves. Since the approach is based on hazards, it guarantees monotonicity of the sensitivities/specificities for increasing thresholds. The approach is less efficient than the multinomial/normal approach. On the other hand, the Poisson-correlated gamma frailty model makes no assumptions about the relationship between sensitivity and specificity, gives consistent results, appears to be quite robust against different between-study variation models, and is computationally very fast and reliable with regard to the overall sensitivities/specificities.
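
The monotonicity argument can be illustrated with a minimal sketch for a single study. This is not the Poisson-correlated gamma frailty model itself (it has no between-study frailty and no pooling); it only shows why a hazard-based parametrization cannot produce non-monotone sensitivities or specificities. The function names, the example counts, and the convention that a result above a threshold counts as test-positive are all assumptions made for illustration.

```python
from itertools import accumulate
from operator import mul


def discrete_hazards(counts):
    """Discrete hazard at each threshold, from counts in ordered categories.

    counts[k] is the number of individuals whose test result falls in
    category k (lowest category first). The hazard at threshold k is the
    share of individuals still "at risk" (category k or higher) whose
    result is exactly category k. The highest category has no threshold
    above it, so it contributes no hazard.
    """
    at_risk = sum(counts)
    hazards = []
    for c in counts[:-1]:
        hazards.append(c / at_risk if at_risk > 0 else 0.0)
        at_risk -= c
    return hazards


def survival_from_hazards(hazards):
    """S_k = prod_{j <= k} (1 - h_j): proportion with a result above threshold k.

    Every factor (1 - h_j) lies in [0, 1], so S_k can never increase with k.
    """
    return list(accumulate((1.0 - h for h in hazards), mul))


def sensitivities(diseased_counts):
    # Assumed convention: test-positive means a result above the threshold.
    return survival_from_hazards(discrete_hazards(diseased_counts))


def specificities(non_diseased_counts):
    # Specificity is the proportion of non-diseased at or below the threshold.
    surv = survival_from_hazards(discrete_hazards(non_diseased_counts))
    return [1.0 - s for s in surv]


# Hypothetical counts for one study with four ordered result categories.
diseased = [5, 15, 30, 50]
non_diseased = [60, 25, 10, 5]

sens = sensitivities(diseased)
spec = specificities(non_diseased)
print("sensitivity by threshold:", [round(s, 2) for s in sens])
print("specificity by threshold:", [round(s, 2) for s in spec])
```

With these hypothetical counts the sensitivities fall from about 0.95 to 0.80 to 0.50 as the threshold rises, while the specificities rise from about 0.60 to 0.85 to 0.95. Because each step multiplies by a factor between 0 and 1, the ordering holds for any valid set of hazards, including hazards that have been modified by a study-level random effect.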

Publication types

  • Meta-Analysis
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Diagnostic Tests, Routine / methods*
  • Diagnostic Tests, Routine / standards
  • Humans
  • Reproducibility of Results
  • Sensitivity and Specificity
  • Survival Analysis