AI Based CMR Assessment of Biventricular Function: Clinical Significance of Intervendor Variability and Measurement Errors

JACC Cardiovasc Imaging. 2022 Mar;15(3):413-427. doi: 10.1016/j.jcmg.2021.08.011. Epub 2021 Oct 13.


Objectives: The aim of this study was to determine whether left ventricular ejection fraction (LVEF), right ventricular ejection fraction (RVEF), and left ventricular mass (LVM) measurements made using 3 fully automated deep learning (DL) algorithms are accurate and interchangeable and can be used to classify ventricular function and risk-stratify patients as accurately as an expert.

Background: Artificial intelligence is increasingly used to assess cardiac function and LVM from cardiac magnetic resonance images.

Methods: Two hundred patients were identified from a registry of individuals who underwent vasodilator stress cardiac magnetic resonance. LVEF, LVM, and RVEF were determined using 3 fully automated commercial DL algorithms and by a clinical expert (CLIN) using conventional methodology. Additionally, LVEF values were classified according to clinically important ranges: <35%, 35% to 50%, and ≥50%. Both the ejection fraction values and the classifications produced by the DL approaches were compared against the CLIN ejection fraction reference. Receiver-operating characteristic curve analysis was performed to evaluate the ability of CLIN and each of the DL classifications to predict major adverse cardiovascular events.
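The classification and comparison steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's code: the function names are hypothetical, the sample values are invented for demonstration, and assigning an LVEF of exactly 50% to the top category is an assumption, since the abstract lists both "35% to 50%" and "≥50%". The area under the receiver-operating characteristic curve is computed here with the pairwise (Mann-Whitney) formulation; the abstract does not state how the authors computed it.

```python
from typing import Sequence


def classify_lvef(lvef_percent: float) -> str:
    """Map an LVEF value (in %) to one of the three ranges named in the abstract.

    Assumption: exactly 50% falls in the top category.
    """
    if lvef_percent < 35:
        return "<35%"
    if lvef_percent < 50:
        return "35% to 50%"
    return ">=50%"


def classification_agreement(reference: Sequence[float],
                             automated: Sequence[float]) -> float:
    """Fraction of cases where the automated class equals the reference class."""
    if len(reference) != len(automated) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(classify_lvef(r) == classify_lvef(a)
                  for r, a in zip(reference, automated))
    return matches / len(reference)


def roc_auc(risk_scores: Sequence[float], events: Sequence[bool]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen case with an event has a
    higher risk score than a randomly chosen case without one; ties count 0.5.
    Higher scores must mean higher risk (for LVEF, pass e.g. the negated value).
    """
    with_event = [s for s, e in zip(risk_scores, events) if e]
    without_event = [s for s, e in zip(risk_scores, events) if not e]
    if not with_event or not without_event:
        raise ValueError("need at least one case with and one without an event")
    wins = sum((p > n) + 0.5 * (p == n)
               for p in with_event for n in without_event)
    return wins / (len(with_event) * len(without_event))


# Invented illustrative values, not data from the study.
clin_lvef = [30.0, 42.0, 55.0, 61.0, 48.0]
dl_lvef = [33.0, 37.0, 52.0, 58.0, 51.0]
agreement = classification_agreement(clin_lvef, dl_lvef)
```

In this toy data the last case moves from the middle range (48%) to the top range (51%) despite a difference of only 3 percentage points, which shows how small measurement differences near a threshold can change the class.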

Results: Excellent correlations were seen for each DL-LVEF compared with CLIN-LVEF (r = 0.83-0.93). Good correlations were present between DL-LVM and CLIN-LVM (r = 0.75-0.85). Modest correlations were observed between DL-RVEF and CLIN-RVEF (r = 0.59-0.68). A >10% error between CLIN and DL ejection fraction was present in 5% to 18% of cases for the left ventricle and 23% to 43% for the right ventricle. LVEF classification agreed with CLIN-LVEF classification in 86%, 80%, and 85% of cases for the 3 DL-LVEF approaches. There were no differences among the 4 approaches in associations with major adverse cardiovascular events for LVEF, LVM, and RVEF.
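The two agreement metrics reported above, the correlation coefficient and the share of cases with a large error, can be sketched as follows. This is an illustrative sketch with hypothetical function names. It assumes that r denotes Pearson's correlation and that a ">10% error" means an absolute difference of more than 10 ejection fraction percentage points; the abstract specifies neither.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two paired samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("inputs must be paired and contain at least 2 cases")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


def fraction_with_large_error(reference: Sequence[float],
                              automated: Sequence[float],
                              threshold_points: float = 10.0) -> float:
    """Fraction of cases whose absolute difference exceeds the threshold.

    Assumption: the error is measured in absolute ejection fraction
    percentage points, not relative to the reference value.
    """
    if len(reference) != len(automated) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    large = sum(abs(a - r) > threshold_points
                for r, a in zip(reference, automated))
    return large / len(reference)
```

A high correlation does not rule out large individual errors: r measures how well the paired values follow a straight line, whereas the error fraction counts the cases in which the two methods disagree by a clinically relevant margin.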

Conclusions: This study revealed good agreement between automated and expert-derived LVEF, and the automated measurements were associated with outcomes as strongly as the expert-derived ones. However, the ability of these automated measurements to accurately classify left ventricular function for treatment decisions remains limited. DL-LVM showed good agreement with CLIN-LVM. DL-RVEF approaches need further refinement.

Keywords: deep learning; ejection fraction; machine learning; ventricular function.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Artificial Intelligence
  • Cardiovascular Diseases*
  • Heart Ventricles / diagnostic imaging
  • Humans
  • Predictive Value of Tests
  • Stroke Volume
  • Ventricular Function, Left
  • Ventricular Function, Right*