Choosing Clinical Variables for Risk Stratification Post-Acute Coronary Syndrome

Sci Rep. 2019 Oct 10;9(1):14631. doi: 10.1038/s41598-019-50933-3.


Most risk stratification methods rely on expert opinion to identify a fixed number of clinical variables that have prognostic significance. In this study, our goal was to develop improved metrics that use a variable number of input parameters. We first used Bootstrap Lasso Regression (BLR), a machine learning method for selecting important variables, to find a prognostic set of features that identify patients at high risk of death 6 months after presenting with an acute coronary syndrome. Using data derived from the Global Registry of Acute Coronary Events (GRACE), we trained a logistic regression model on these features and evaluated its performance on a development set (N = 43,063) containing patients who have values for all features, and on a separate dataset (N = 6,363) containing patients who have missing feature values. The final model, Ridge Logistic Regression with Variable Inputs (RLRVI), uses imputation to estimate values for missing features. BLR identified 19 features, 8 of which appear in the GRACE score. RLRVI showed a modest, yet statistically significant, improvement over the standard GRACE score on both datasets. Moreover, for patients who are relatively low-risk (GRACE ≤ 87), RLRVI had an AUC and hazard ratio of 0.754 and 6.27, respectively, vs. 0.688 and 2.46 for GRACE (p < 0.007). RLRVI has improved discriminatory performance on patients who have values for the 8 GRACE features plus any subset of the 11 non-GRACE features. Our results demonstrate that BLR and data imputation can be used to obtain improved risk stratification metrics, particularly for patients who are classified as low risk using traditional methods.
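
The variable-selection idea behind a bootstrap lasso can be illustrated with a minimal sketch. It is not the study's implementation: the synthetic two-feature data, the penalty `lam`, the number of resamples, the 0.8 selection threshold, and all function names are assumptions made for illustration. The sketch refits an L1-penalized logistic regression on bootstrap resamples and keeps the features whose coefficients are nonzero in most of them.

```python
import math
import random


def soft_threshold(z, t):
    """Shrink z toward zero by t; values within [-t, t] become exactly 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def fit_lasso_logistic(X, y, lam, lr=0.5, n_iter=100):
    """L1-penalized logistic regression via proximal gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            err = 1.0 / (1.0 + math.exp(-z)) - yi  # predicted prob minus label
            grad_b += err
            for j in range(d):
                grad_w[j] += err * xi[j]
        b -= lr * grad_b / n  # the intercept is not penalized
        w = [soft_threshold(w[j] - lr * grad_w[j] / n, lr * lam)
             for j in range(d)]
    return w, b


def bootstrap_selection_frequency(X, y, lam, n_boot, rng):
    """Fraction of bootstrap resamples in which each coefficient is nonzero."""
    n, d = len(X), len(X[0])
    counts = [0] * d
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        Xb = [X[i] for i in idx]
        yb = [y[i] for i in idx]
        w, _b = fit_lasso_logistic(Xb, yb, lam)
        for j in range(d):
            if w[j] != 0.0:
                counts[j] += 1
    return [c / n_boot for c in counts]


# Synthetic data (hypothetical): feature 0 drives the outcome, feature 1 is noise.
rng = random.Random(0)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(150)]
y = [1 if x[0] + 0.5 * rng.gauss(0, 1) > 0 else 0 for x in X]

freq = bootstrap_selection_frequency(X, y, lam=0.1, n_boot=20, rng=rng)
selected = [j for j, f in enumerate(freq) if f >= 0.8]  # assumed threshold
print("selection frequency per feature:", freq)
print("selected feature indices:", selected)
```

In a full pipeline along the lines described above, the selected features would then feed a ridge-penalized logistic regression, with missing values imputed before prediction; those steps are omitted here.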

Publication types

  • Multicenter Study
  • Observational Study
  • Validation Study

MeSH terms

  • Acute Coronary Syndrome / mortality*
  • Acute Coronary Syndrome / surgery
  • Aged
  • Cohort Studies
  • Female
  • Follow-Up Studies
  • Humans
  • Logistic Models
  • Machine Learning*
  • Male
  • Middle Aged
  • Patient Selection
  • Percutaneous Coronary Intervention*
  • Prognosis
  • Registries / statistics & numerical data
  • Risk Assessment / methods
  • Risk Factors