Protein language models (PLMs) trained solely on sequence data have significantly advanced our understanding of protein biology and achieved strong performance in protein prediction tasks. However, because they lack explicit three-dimensional (3D) structural features, their predictive power is limited in applications that rely heavily on 3D conformation. To address this limitation, we developed two structure-aware PLMs, S-PLM1 and S-PLM2, that employ multi-view contrastive learning to align protein sequences with their 3D structures in a unified latent space. S-PLM1 represents structural information using contact maps encoded by a pretrained Swin-Transformer, whereas S-PLM2 directly encodes 3D backbone coordinates through a Geometric Vector Perceptron (GVP)-based model. The paired sequence-structure data were obtained from AlphaFoldDB. For both models, we designed efficient tuning strategies intended to achieve strong performance at low computational cost. Here, we present detailed protocols for adapting S-PLM1 and S-PLM2 to diverse protein applications. The protocols provide step-by-step guidance on generating structure-aware representations from S-PLMs, fine-tuning the models for various protein prediction tasks, and using S-PLM2 to produce structure embeddings for structure-based downstream analyses. We also provide source code and Google Colab implementations for easy customization and deployment. © 2026 Wiley Periodicals LLC.
Basic Protocol 1: Generating structure-aware representations of protein sequences
Basic Protocol 2: Efficient tuning of structure-aware protein language models for diverse protein applications
Basic Protocol 3: Using S-PLM2 to generate protein structure representations and conduct structure-based clustering
Support Protocol: Google Colab quick start notebooks
Keywords: protein analyses; protein language model; protein prediction; protein sequence; protein structure.
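To make the two ideas named in the abstract more concrete, the following is a minimal NumPy sketch of (a) deriving a binary contact map from C-alpha coordinates and (b) a symmetric contrastive (InfoNCE-style) loss that pulls paired sequence and structure embeddings together in a shared space. It is an illustration only, not the S-PLM implementation: the function names, the 8 Å contact threshold, and the temperature of 0.07 are assumptions chosen for the example, and the real models use learned encoders (ESM-style sequence encoder, Swin-Transformer, GVP) rather than the raw arrays used here.

```python
import numpy as np


def contact_map(ca_coords, threshold=8.0):
    """Binary residue-residue contact map from C-alpha coordinates of shape (L, 3).

    The 8.0 angstrom threshold is an assumed, commonly used value.
    """
    diff = ca_coords[:, None, :] - ca_coords[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    return (dist < threshold).astype(np.float32)


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products are cosine similarities."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, eps)


def log_softmax(logits, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def symmetric_info_nce(seq_emb, struct_emb, temperature=0.07):
    """Symmetric contrastive loss over a batch of paired embeddings.

    Row i of seq_emb and row i of struct_emb are treated as a positive pair;
    all other rows in the batch act as negatives. The temperature is an
    assumed value, not one taken from the protocols.
    """
    seq = l2_normalize(seq_emb)
    struct = l2_normalize(struct_emb)
    logits = seq @ struct.T / temperature  # (N, N) similarity matrix
    idx = np.arange(logits.shape[0])
    # Sequence-to-structure direction: softmax over each row.
    loss_seq_to_struct = -log_softmax(logits, axis=1)[idx, idx].mean()
    # Structure-to-sequence direction: softmax over each column.
    loss_struct_to_seq = -log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (loss_seq_to_struct + loss_struct_to_seq)
```

In this sketch, a batch whose sequence and structure embeddings are correctly paired yields a lower loss than the same batch with the pairing shuffled, which is the signal a contrastive objective of this kind uses to align the two views.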