Unifying proteomic technologies with ProteinProjector

Bioinform Adv. 2025 Oct 22;5(1):vbaf266. doi: 10.1093/bioadv/vbaf266. eCollection 2025.

Abstract

Summary: Proteomics has developed many approaches to characterize the subcellular organization of proteins, each with differing coverage and sensitivity to distinct scales. Here, we develop a self-supervised deep learning framework, ProteinProjector, that flexibly integrates all available data for a protein from any number of modalities, resulting in a unified map of protein position. As an initial proof of concept, we integrate four proteome-wide characterizations of HEK293 human embryonic kidney cells: protein affinity purification, proximity ligation, and size-exclusion chromatography, each coupled to mass spectrometry (AP-MS, PL-MS, SEC-MS), as well as protein fluorescence imaging. Map coverage and accuracy grow substantially as new data modes are added, with maximal recovery of known complexes observed when using all four proteomic datasets. We find that ProteinProjector outperforms individual modalities and other integration methods in recovery of orthogonal functional and physical associations not used during training. ProteinProjector provides a foundation for integration of diverse modalities that characterize subcellular structure.
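
To make the idea of integrating "all available data for a protein from any number of modalities" concrete, the following toy sketch builds one vector per protein from whichever modalities happen to cover it. It is not the ProteinProjector method: simple mean-pooling of unit-normalized vectors stands in for the self-supervised deep learning model, and it assumes, unrealistically, that all modalities already share one vector space. The function names (`l2_normalize`, `integrate`, `cosine`), the protein labels, and the numeric values are hypothetical and chosen only for illustration.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def integrate(modalities):
    """Mean-pool the unit-normalized vectors available for each protein.

    modalities: dict mapping modality name -> {protein: vector}.
    Assumption (not from the paper): every vector has the same
    dimension and lives in a shared space.
    """
    proteins = sorted(set().union(*(m.keys() for m in modalities.values())))
    unified = {}
    for protein in proteins:
        # Use only the modalities that actually measured this protein.
        available = [
            l2_normalize(m[protein])
            for m in modalities.values()
            if protein in m
        ]
        dim = len(available[0])
        mean = [sum(v[i] for v in available) / len(available) for i in range(dim)]
        unified[protein] = l2_normalize(mean)
    return unified


def cosine(a, b):
    """Cosine similarity of two unit-length vectors (a plain dot product)."""
    return sum(x * y for x, y in zip(a, b))


# Toy, made-up inputs: each modality covers a different subset of proteins.
apms = {"protA": [1.0, 0.0], "protB": [0.9, 0.1]}
imaging = {"protB": [1.0, 0.0], "protC": [0.0, 1.0]}

unified = integrate({"AP-MS": apms, "imaging": imaging})
print(sorted(unified))  # coverage is the union of both modalities
print(cosine(unified["protA"], unified["protB"]))
print(cosine(unified["protA"], unified["protC"]))
```

In this toy setting the unified map covers the union of proteins seen by either modality, which mirrors in a very simplified way how coverage can grow as data modes are added. Proximity between proteins is read off as cosine similarity between their unified vectors.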

Availability and implementation: ProteinProjector is available as part of the Cell Mapping Toolkit at https://github.com/idekerlab/cellmaps_coembedding.