Background: For the evaluation and comparison of markers and risk prediction models, various novel measures have recently been introduced as alternatives to the commonly used difference in the area under the receiver operating characteristic (ROC) curve (ΔAUC). The net reclassification improvement (NRI) is increasingly popular for comparing predictions at 1 or more risk thresholds, but decision-analytic approaches have also been proposed.
Objective: We aimed to identify the mathematical relationships between novel performance measures for the situation in which a single risk threshold T is used to classify patients as having the outcome or not.
Methods: We considered the NRI and 3 utility-based measures that take misclassification costs into account: difference in net benefit (ΔNB), difference in relative utility (ΔRU), and weighted NRI (wNRI). We illustrate the behavior of these measures in 1938 women suspected of having ovarian cancer (prevalence 28%).
Results: The 3 utility-based measures appear to be transformations of each other and hence always lead to consistent conclusions. On the other hand, conclusions may differ when using the standard NRI, depending on the adopted risk threshold T, prevalence P, and the obtained differences in sensitivity and specificity of the 2 models that are compared. In the case study, adding the CA-125 tumor marker to a baseline set of covariates yielded a negative NRI yet a positive value for the utility-based measures.
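The relationships described in the Results can be sketched numerically. The following is a minimal illustration, not the authors' code. It assumes the usual single-threshold definitions, none of which are quoted from this abstract: NRI = Δsens + Δspec; ΔNB = P·Δsens + w·(1 − P)·Δspec with odds w = T/(1 − T); ΔRU equal to ΔNB divided by P when T ≥ P and by w·(1 − P) otherwise; and wNRI with savings s1 for events and s2 = s1·w for non-events. The function name `compare_measures` and the values for Δsens and Δspec are hypothetical; only the 28% prevalence comes from the case study.

```python
def compare_measures(d_sens, d_spec, prevalence, threshold, s1=1.0):
    """Return NRI and three utility-based measures for one risk threshold.

    d_sens, d_spec: differences in sensitivity and specificity (new - old).
    s1: assumed saving per correctly classified event (hypothetical scale).
    """
    if not 0.0 < threshold < 1.0 or not 0.0 < prevalence < 1.0:
        raise ValueError("threshold and prevalence must lie strictly in (0, 1)")

    # Odds of the threshold: harm of a false positive relative to
    # the benefit of a true positive.
    w = threshold / (1.0 - threshold)

    # Standard two-category NRI: ignores prevalence and costs.
    nri = d_sens + d_spec

    # Difference in net benefit: weights specificity gains by w.
    d_nb = prevalence * d_sens + w * (1.0 - prevalence) * d_spec

    # Difference in relative utility: net benefit rescaled by the gain of
    # perfect prediction over the best default strategy (assumed definition).
    scale = prevalence if threshold >= prevalence else w * (1.0 - prevalence)
    d_ru = d_nb / scale

    # Weighted NRI with savings consistent with the threshold (s2 / s1 = w),
    # which makes it equal to s1 * d_nb.
    s2 = s1 * w
    wnri = s1 * prevalence * d_sens + s2 * (1.0 - prevalence) * d_spec

    return {"NRI": nri, "dNB": d_nb, "dRU": d_ru, "wNRI": wnri}


# Hypothetical inputs: a small gain in sensitivity, a larger loss in specificity.
result = compare_measures(d_sens=0.03, d_spec=-0.05, prevalence=0.28, threshold=0.10)
for name, value in result.items():
    print(f"{name}: {value:+.4f}")
```

With these illustrative inputs the NRI is negative while ΔNB, ΔRU, and wNRI are all positive. Under the assumed definitions, ΔRU and wNRI are ΔNB multiplied by a positive constant, so the three utility-based measures always share a sign, whereas the NRI weights Δsens and Δspec equally regardless of T and P.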
Conclusions: The decision-analytic measures are each appropriate for indicating the clinical usefulness of an added marker or for comparing prediction models, since each reflects misclassification costs. This is of practical importance because these measures may thus change conclusions based on purely statistical measures. A range of risk thresholds should be considered in applying these measures.
Keywords: clinical prediction rules; decision analysis; decision rules.