Medical imaging is fundamental to cardiovascular diagnostics, and modalities such as Transthoracic Echocardiography (TTE) and Cardiac Magnetic Resonance (CMR) offer complementary strengths. TTE provides real-time, non-invasive visualization of cardiac function but is often limited by operator dependency and incomplete views. CMR, in contrast, delivers comprehensive, high-resolution structural assessment, but at greater cost and with longer examination times. To address these limitations, this study explores cross-modal generative modeling for synthesizing CMR-like images directly from TTE. We propose a novel architecture that combines a UNet backbone with a vision transformer: the UNet performs feature extraction, and the transformer applies global attention to improve image synthesis quality. Quantitative and qualitative evaluations demonstrate the model's ability to produce realistic and anatomically consistent CMR images, indicating strong potential to improve diagnostic accuracy and clinical decision-making across imaging modalities.
Keywords: Echocardiography; Generative AI; MRI.
Copyright © 2026 The Authors. Published by Elsevier Inc. All rights reserved.
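
The abstract describes the architecture only at a high level: a UNet backbone for feature extraction, combined with a vision transformer for global attention. The sketch below is one possible reading of that data flow, not the authors' implementation. It assumes the attention is applied at the UNet bottleneck, uses random untrained weights, and replaces learned convolution blocks with simple 1x1 channel projections. All function names, channel widths, and the two-level depth are hypothetical choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # random, untrained weights (illustration only)

def conv1x1(x, out_ch):
    """Stand-in for a learned conv block: (C, H, W) -> (out_ch, H, W), then ReLU."""
    w = rng.standard_normal((out_ch, x.shape[0])) / np.sqrt(x.shape[0])
    return np.maximum(np.einsum("oc,chw->ohw", w, x), 0.0)

def downsample(x):
    """2x2 average pooling: (C, H, W) -> (C, H/2, W/2)."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).mean(axis=(2, 4))

def upsample(x):
    """Nearest-neighbour 2x upsampling: (C, H, W) -> (C, 2H, 2W)."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def global_self_attention(x):
    """Single-head self-attention over every spatial position of a (C, H, W) map."""
    c, h, w = x.shape
    tokens = x.reshape(c, h * w).T                     # (N, C): one token per position
    wq, wk, wv = (rng.standard_normal((c, c)) / np.sqrt(c) for _ in range(3))
    q, k, v = tokens @ wq, tokens @ wk, tokens @ wv
    scores = q @ k.T / np.sqrt(c)                      # (N, N): all-pairs similarity
    scores -= scores.max(axis=1, keepdims=True)        # stabilise the softmax
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)
    out = tokens + attn @ v                            # residual connection
    return out.T.reshape(c, h, w)

def unet_transformer_forward(tte_frame):
    """Map one (H, W) TTE frame to an (H, W) CMR-like output; H, W divisible by 4."""
    x = tte_frame[None, :, :]                          # (1, H, W)
    e1 = conv1x1(x, 8)                                 # encoder level 1: (8, H, W)
    e2 = conv1x1(downsample(e1), 16)                   # encoder level 2: (16, H/2, W/2)
    b = conv1x1(downsample(e2), 32)                    # bottleneck: (32, H/4, W/4)
    b = global_self_attention(b)                       # assumed transformer placement
    d2 = conv1x1(np.concatenate([upsample(b), e2], axis=0), 16)   # skip from e2
    d1 = conv1x1(np.concatenate([upsample(d2), e1], axis=0), 8)   # skip from e1
    w_out = rng.standard_normal(8) / np.sqrt(8)        # final projection to 1 channel
    return np.einsum("c,chw->hw", w_out, d1)
```

Calling `unet_transformer_forward` on a 32x32 array returns a 32x32 array. Because the weights are random, the output is meaningless as an image; the sketch only shows how skip connections carry encoder features to the decoder while the attention step lets every bottleneck position attend to every other one.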