Simultaneous profiling of spatial transcriptomics (ST) and spatial metabolomics (SM) on the same or adjacent tissue sections offers a revolutionary approach to decoding the tissue microenvironment and identifying potential therapeutic targets for cancer immunotherapy. Compared with other spatial omics combinations, cross-modal integration of ST and SM data is particularly challenging because transcript counts and metabolite intensities follow different feature distributions and because the two modalities differ inherently in spatial morphology and resolution. Furthermore, cross-sample integration is essential for capturing both consensus and heterogeneous spatial patterns but is often complicated by batch effects. Here, we introduce SpatialMETA, a conditional variational autoencoder (CVAE)-based framework for cross-modal and cross-sample integration of ST and SM data. SpatialMETA employs tailored decoders and loss functions to enhance modality fusion, batch-effect correction and biological conservation, enabling interpretable integration of spatially correlated ST-SM patterns and downstream analysis. SpatialMETA identifies immune spatial clusters with distinct metabolic features in cancer, revealing insights that extend beyond the original study. Compared with existing tools, SpatialMETA demonstrates superior reconstruction capability and fused modality representation, accurately capturing ST and SM feature distributions. In summary, SpatialMETA offers a powerful platform for advancing spatial multi-omics research and refining the understanding of metabolic heterogeneity within the tissue microenvironment.
© 2025. The Author(s).