Background: Though most of the transcripts are long non-coding RNAs (lncRNAs), little is known about their functions. lncRNAs usually function through interactions with proteins, which implies the importance of identifying the binding proteins of lncRNAs in understanding the molecular mechanisms underlying the functions of lncRNAs. Only a few approaches are available for predicting interactions between lncRNAs and proteins. In this study, we introduce a new method lncPro.
Results: By encoding RNA and protein sequences into numeric vectors, we used matrix multiplication to score each RNA-protein pair. This score can be used to measure the interactions between an RNA-protein pair. This method effectively discriminates interacting and non-interacting RNA-protein pairs and predicts RNA-protein interactions within a given complex. Applying this method on all human proteins, we found that the long non-coding RNAs we collected tend to interact with nuclear proteins and RNA-binding proteins.
Conclusions: Compared with the existing approaches, our method shortens the time for training matrix and obtains optimal results based on the model being used. The ability of predicting the associations between lncRNAs and proteins has also been enhanced. Our method provides an idea on how to integrate different information into the prediction process.