Machine Learning Prediction of Mortality and Hospitalization in Heart Failure With Preserved Ejection Fraction

JACC Heart Fail. 2020 Jan;8(1):12-21. doi: 10.1016/j.jchf.2019.06.013. Epub 2019 Oct 9.


Objectives: This study sought to develop models for predicting mortality and heart failure (HF) hospitalization for outpatients with HF with preserved ejection fraction (HFpEF) in the TOPCAT (Treatment of Preserved Cardiac Function Heart Failure with an Aldosterone Antagonist) trial.

Background: Although risk assessment models are available for patients with HF with reduced ejection fraction, few have assessed the risks of death and hospitalization in patients with HFpEF.

Methods: Five methods were used to train models for assessing risks of mortality and HF hospitalization through 3 years of follow-up: logistic regression with forward selection of variables; logistic regression with lasso regularization for variable selection; random forest (RF); gradient descent boosting; and support vector machine. Models were validated using 5-fold cross-validation. Model discrimination was estimated using receiver-operating characteristic curves, and calibration was estimated using Brier scores. The top prediction variables were identified from the best performing models according to the incremental improvement contributed by each variable in 5-fold cross-validation.
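For readers unfamiliar with the evaluation metrics named in the Methods, the following is a minimal sketch of how a C-statistic (area under the receiver-operating characteristic curve), a Brier score, and a 5-fold split can be computed. It is an illustration only, not the study's code: the function names (c_statistic, brier_score, k_fold_indices) and the toy data are hypothetical, and the abstract does not state how folds were assigned or which software was used.

```python
from itertools import product


def c_statistic(y_true, y_score):
    """Fraction of (event, non-event) pairs in which the event case
    received the higher predicted risk; ties count as 0.5."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    concordant = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return concordant / (len(pos) * len(neg))


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def k_fold_indices(n, k=5):
    """Yield (train, test) index lists for k contiguous folds.
    Assumption: no shuffling or stratification, for simplicity."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size


# Toy data (hypothetical): 1 = event occurred, 0 = no event.
outcomes = [0, 0, 1, 1]
predicted_risk = [0.1, 0.4, 0.35, 0.8]
auc = c_statistic(outcomes, predicted_risk)
brier = brier_score(outcomes, predicted_risk)
folds = list(k_fold_indices(10, k=5))
```

In this sketch a C-statistic of 0.5 corresponds to chance-level discrimination and 1.0 to perfect ranking, while a lower Brier score indicates predicted probabilities closer to the observed outcomes. In a cross-validated setting, each metric would be computed on the held-out fold and then averaged across the 5 folds.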

Results: The RF was the best performing model with a mean C-statistic of 0.72 (95% confidence interval [CI]: 0.69 to 0.75) for predicting mortality (Brier score: 0.17), and 0.76 (95% CI: 0.71 to 0.81) for HF hospitalization (Brier score: 0.19). Blood urea nitrogen levels, body mass index, and Kansas City Cardiomyopathy Questionnaire (KCCQ) subscale scores were strongly associated with mortality, whereas hemoglobin level, blood urea nitrogen, time since previous HF hospitalization, and KCCQ scores were the most significant predictors of HF hospitalization.

Conclusions: These models predict the risks of mortality and HF hospitalization in patients with HFpEF and emphasize the importance of health status data in determining prognosis. (Treatment of Preserved Cardiac Function Heart Failure with an Aldosterone Antagonist [TOPCAT]; NCT00094302).

Keywords: HFpEF; KCCQ; health status; risk.

Publication types

  • Multicenter Study
  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Aged
  • Argentina / epidemiology
  • Brazil / epidemiology
  • Canada / epidemiology
  • Double-Blind Method
  • Female
  • Health Status*
  • Heart Failure / mortality*
  • Heart Failure / physiopathology
  • Hospitalization / statistics & numerical data*
  • Humans
  • Machine Learning*
  • Male
  • Middle Aged
  • Prognosis
  • ROC Curve
  • Risk Assessment / methods*
  • Risk Factors
  • Stroke Volume / physiology*
  • Survival Rate / trends
  • United States / epidemiology
