Estimating diagnostic test accuracy using a "fuzzy gold standard"

Med Decis Making. Jan-Mar 1995;15(1):44-57. doi: 10.1177/0272989X9501500108.


This study uses Monte Carlo methods to examine what happens when the criterion standard ("gold standard") used to assess a diagnostic test's accuracy with ROC curves itself contains some error. Two phenomena emerge: 1) When diagnostic test errors are statistically independent of the errors in an inaccurate ("fuzzy") gold standard (FGS), estimated test accuracy declines. 2) When the test and the FGS have statistically dependent errors, test accuracy can be overstated. Two methods are proposed to remove the first of these biases, and the risk that they exacerbate the second is explored. Both require a probabilistic (rather than binary) gold-standard statement (e.g., the probability that each case is abnormal). The more promising of the two, the "two-truth" method, selectively eliminates the cases where the gold standard is most ambiguous (probability near 0.5). When diagnostic test and FGS errors are independent, this approach can eliminate much of the downward bias caused by FGS error without meaningful risk of overstating test accuracy. When the test and FGS have dependent errors, the resulting upward bias can, in the most extreme cases, cause test accuracy to be overstated even before the offsetting "two-truth" approach is employed.
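The independent-error case can be illustrated with a small Monte Carlo sketch. It is not the authors' simulation: the Gaussian test scores, the logistic form of the gold-standard probability, the 0.2 to 0.8 exclusion band, and every name and parameter below (`simulate`, `test_sep`, `fgs_sep`, `band`) are assumptions chosen for illustration. The sketch compares the ROC area measured against the true disease state, against a binary FGS, and against the FGS after dropping cases whose gold-standard probability lies near 0.5, in the spirit of the exclusion step described above.

```python
import math
import random


def auc(scores, labels):
    """ROC area via the Mann-Whitney rank-sum identity (assumes no tied scores)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum = sum(rank for rank, i in enumerate(order, start=1) if labels[i])
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def simulate(n=20000, test_sep=1.5, fgs_sep=1.0, band=0.3, seed=0):
    """Hypothetical setup: test noise and gold-standard noise are independent."""
    rng = random.Random(seed)
    truth, score, fgs_prob = [], [], []
    for _ in range(n):
        abnormal = rng.random() < 0.5           # true state, 50% prevalence (assumed)
        truth.append(abnormal)
        # Diagnostic test score: unit-variance normal, shifted for abnormal cases.
        score.append(rng.gauss(test_sep if abnormal else 0.0, 1.0))
        # Noisy gold-standard evidence, drawn independently of the test score.
        evidence = rng.gauss(fgs_sep if abnormal else -fgs_sep, 1.0)
        # Probabilistic gold-standard statement (logistic link is an assumption).
        fgs_prob.append(1.0 / (1.0 + math.exp(-2.0 * evidence)))

    fgs_label = [p > 0.5 for p in fgs_prob]     # binary fuzzy gold standard
    # Keep only cases whose gold-standard probability is far from 0.5.
    keep = [i for i, p in enumerate(fgs_prob) if abs(p - 0.5) >= band]
    return {
        "truth": auc(score, truth),
        "fgs": auc(score, fgs_label),
        "fgs_trimmed": auc([score[i] for i in keep], [fgs_label[i] for i in keep]),
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.3f}")
```

Under these assumptions the ROC area against the binary FGS falls below the area against the true state, and excluding the ambiguous cases recovers part of the gap. Modelling dependent errors would require correlating the two noise terms, which this sketch does not attempt.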

Publication types

  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Clinical Laboratory Techniques / standards*
  • Fuzzy Logic*
  • Humans
  • Magnetic Resonance Imaging
  • Monte Carlo Method*
  • Multiple Sclerosis / diagnosis
  • Predictive Value of Tests
  • ROC Curve*
  • Sensitivity and Specificity
  • Stochastic Processes