Multimodal Single-Cell Translation and Alignment with Semi-Supervised Learning

J Comput Biol. 2022 Nov;29(11):1198-1212. doi: 10.1089/cmb.2022.0264. Epub 2022 Oct 14.

Abstract

Single-cell multi-omics technologies enable comprehensive interrogation of cellular regulation, yet most single-cell assays measure only one type of activity (such as transcription, chromatin accessibility, DNA methylation, or 3D chromatin architecture) for each cell. To enable a multimodal view for individual cells, we propose Polarbear, a semi-supervised machine learning framework that facilitates missing modality profile prediction and single-cell cross-modality alignment. Polarbear learns to translate between modalities by using data from co-assay measurements coupled with the large quantity of single-assay data available in public databases. This semi-supervised scheme mitigates issues related to low cell quantities and high sparsity in co-assay data. Polarbear first pre-trains a beta-variational autoencoder for each modality using both co-assay and single-assay profiles to learn robust representations of individual cells, and it then uses the co-assay labels to train a translator between these cell representations. This semi-supervised framework enables us to predict missing modality profiles and match single cells across modalities with improved accuracy compared with fully supervised methods, thus facilitating multimodal data integration.
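
The two-stage scheme described above can be illustrated with a minimal NumPy sketch. It is not the Polarbear implementation: the squared-error reconstruction term, the linear least-squares translator (standing in for a learned translator network), the nearest-neighbor matching rule, and all function names are assumptions made for illustration only. The sketch shows (1) a beta-weighted variational autoencoder objective of the kind used for per-modality pre-training, (2) a translator between latent representations fit on paired co-assay cells, and (3) cross-modality matching of cells in the translated latent space.

```python
import numpy as np


def beta_vae_loss(x, x_recon, mu, logvar, beta=1.0):
    """Hypothetical beta-VAE objective: reconstruction error + beta * KL.

    x, x_recon : (cells, features) input and decoder output.
    mu, logvar : (cells, latent_dim) parameters of the diagonal Gaussian
                 posterior; the KL term is taken against a standard normal.
    Squared error is an assumption; count data often use other likelihoods.
    """
    recon = np.sum((x - x_recon) ** 2, axis=1)
    kl = -0.5 * np.sum(1.0 + logvar - mu ** 2 - np.exp(logvar), axis=1)
    return float(np.mean(recon + beta * kl))


def fit_linear_translator(z_src, z_tgt):
    """Fit W minimizing ||z_src @ W - z_tgt||^2 on paired co-assay embeddings.

    A linear map is a stand-in here for a trained translator network.
    """
    W, *_ = np.linalg.lstsq(z_src, z_tgt, rcond=None)
    return W


def match_cells(z_translated, z_tgt):
    """For each translated source cell, return the index of the nearest
    target-modality cell by squared Euclidean distance."""
    diff = z_translated[:, None, :] - z_tgt[None, :, :]
    dist = np.sum(diff ** 2, axis=2)
    return np.argmin(dist, axis=1)
```

In this sketch, setting `beta` above 1 weights the KL term more heavily relative to reconstruction, and the translator is fit only on cells that have both modalities measured, while the autoencoders themselves could be pre-trained on co-assay and single-assay profiles alike.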

Keywords: cross-modality translation and multi-omics alignment; single cell multi-omics.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Chromatin*
  • Databases, Factual
  • Supervised Machine Learning*

Substances

  • Chromatin