EDCNN: Identification of Genome-Wide RNA-binding Proteins Using Evolutionary Deep Convolutional Neural Network

Bioinformatics. 2021 Oct 25;btab739. doi: 10.1093/bioinformatics/btab739. Online ahead of print.

Abstract

Motivation: RNA binding proteins (RBPs) are a group of proteins associated with RNA regulation and metabolism, and play an essential role in mediating the maturation, transport, localization and translation of RNA. Recently, Genome-wide RNA-binding event detection methods have been developed to predict RBPs. Unfortunately, the existing computational methods usually suffer some limitations, such as high-dimensionality, data sparsity and low model performance.

Results: Deep convolution neural network has a useful advantage for solving high-dimensional and sparse data. To improve further the performance of deep convolution neural network, we propose evolutionary deep convolutional neural network (EDCNN) to identify protein-RNA interactions by synergizing evolutionary optimization with gradient descent to enhance deep conventional neural network. In particular, EDCNN combines evolutionary algorithms and different gradient descent models in a complementary algorithm, where the gradient descent and evolution steps can alternately optimize the RNA-binding event search. To validate the performance of EDCNN, an experiment is conducted on two large-scale CLIP-seq datasets, and results reveal that EDCNN provides superior performance to other state-of-the-art methods. Furthermore, time complexity analysis, parameter analysis and motif analysis are conducted to demonstrate the effectiveness of our proposed algorithm from several perspectives.

Availability: The EDCNN algorithm is available at GitHub: https://github.com/yaweiwang1232/EDCNN. Both the software and the supporting data can be downloaded from: https://figshare.com/articles/software/EDCNN/16803217.

Supplementary information: Supplementary data are available at Bioinformatics online.