Diagn Progn Res. 2018 May 4;2:7.
doi: 10.1186/s41512-018-0029-2. eCollection 2018.

The index of prediction accuracy: an intuitive measure useful for evaluating risk prediction models

Michael W Kattan et al. Diagn Progn Res. 2018.

Abstract

Background: Many measures of prediction accuracy have been developed. However, the most popular ones in typical medical outcome prediction settings require additional investigation of calibration.

Methods: We show how rescaling the Brier score produces a measure that combines discrimination and calibration in one value and improves interpretability by adjusting for a benchmark model. We have called this measure the index of prediction accuracy (IPA). The IPA permits a common interpretation across binary, time to event, and competing risk outcomes. We illustrate this measure using example datasets.
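The rescaling described in the Methods can be sketched for the simplest case, a binary outcome without censoring. The sketch assumes the benchmark ("null") model assigns the observed event prevalence to every subject, so that IPA = 1 - (Brier score of the model) / (Brier score of the null model). The function names `brier_score` and `ipa_binary` are hypothetical and chosen for illustration; this is not the authors' example code, and the time-to-event and competing-risk versions need censoring-adjusted Brier scores that are not shown here.

```python
import numpy as np


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted risk and observed 0/1 outcome."""
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.asarray(y_prob, dtype=float)
    return float(np.mean((y_prob - y_true) ** 2))


def ipa_binary(y_true, y_prob):
    """IPA for a binary outcome: 1 - Brier(model) / Brier(null model).

    Assumption: the null model predicts the observed prevalence for everyone.
    """
    y_true = np.asarray(y_true, dtype=float)
    prevalence = y_true.mean()
    null_prob = np.full_like(y_true, prevalence)

    bs_model = brier_score(y_true, y_prob)
    bs_null = brier_score(y_true, null_prob)  # equals prevalence * (1 - prevalence)
    if bs_null == 0.0:
        raise ValueError("All outcomes are identical; IPA is undefined.")
    return 1.0 - bs_model / bs_null


# Toy data, invented purely for illustration.
outcomes = [0, 0, 1, 1]
predicted_risk = [0.1, 0.2, 0.8, 0.9]
print(ipa_binary(outcomes, predicted_risk))
```

Under this definition a perfect model scores 1, a model no better than predicting the prevalence scores 0, and a model worse than that benchmark scores below 0.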

Results: The IPA is simple to compute, and example code is provided. The values of the IPA appear very interpretable.

Conclusions: IPA should be a prominent measure reported in studies of medical prediction model performance. However, IPA is only a measure of average performance and, by default, does not measure the utility of a medical decision.

Keywords: Accuracy; Brier score; Prediction.

Conflict of interest statement

Patient data were registered prospectively in a database approved by the Danish Data Protection Agency (file 2006-41-6256). The Committees on Health Research Ethics in the Capital Region of Denmark approved the study (H-2-2012-134). The authors declare that they have no competing interests. Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Figures

Fig. 1. Performance metric as a function of event prevalence. The solid line is the Brier score of the model that predicts the prevalence for all subjects; the dashed line is the corresponding root Brier score.
Fig. 2. Illustration of IPA as a function of discrimination, without and with miscalibration.
Fig. 3. Absolute risk of progression, accounting for non-cancer death as a competing risk.
Fig. 4. Effect on IPA of changing the prediction horizon.
Fig. 5. Comparison of rival prediction models.
