RNA-binding proteins (RBPs) play a pivotal role in regulating gene expression, and their interactions with RNA reflect their biological functions and regulatory mechanisms. However, current computational methods are typically tailored to individual RBPs and depend on specific experimental protocols and batches. To overcome these limitations, we propose PaRPI, a method that predicts RNA-protein binding sites in a bidirectional RBP-RNA selection manner. PaRPI groups all RBP datasets by cell line and integrates experimental data from different protocols and batches, enabling a unified computational model that captures both shared and distinct interaction patterns among different proteins. Our results demonstrate that PaRPI accurately identifies binding sites, surpassing state-of-the-art models on 261 RBP datasets from eCLIP and CLIP-seq experiments. Furthermore, PaRPI generalizes robustly and is uniquely able to predict interactions with previously unseen RNA and protein receptors. We also investigate the impact of disease-associated variants on RBP binding and evaluate PaRPI's components and semantic embeddings, demonstrating its capability to dissect complex interaction networks. PaRPI enables large-scale exploration of RNA-protein interactions, facilitating future studies of gene regulation and disease mechanisms.
© 2025. The Author(s).