Background: Citrullination, an important post-translational modification of proteins, alters the molecular weight and electrostatic charge of the protein side chains. The conversion of arginine residues to citrulline in protein sequences is catalyzed by a class of enzymes called Peptidyl Arginine Deiminases (PADs). PADs are Ca2+-dependent and comprise five isozymes: PAD 1, 2, 3, 4/5, and 6. Citrullinated proteins have been identified in many biological and pathological processes. In particular, abnormal protein citrullination can lead to serious human diseases, including multiple sclerosis and rheumatoid arthritis.
Objective: It is important to identify the citrullination sites in protein sequences. The accurate identification of citrullination sites may contribute to the studies on the molecular functions and pathological mechanisms of related diseases.
Methods and results: In this study, a training set (containing 116 positive and 348 negative samples) was encoded into a feature matrix, and the mRMR method was used to rank the 941-dimensional features by importance. Then, a predictive model based on a self-normalizing neural network (SNN) was proposed to predict the citrullination sites in protein sequences. Incremental Feature Selection (IFS) and 10-fold cross-validation were used to evaluate the model. Three classical machine learning models, namely random forest, support vector machine, and k-nearest neighbor, were evaluated in the same way and compared with the SNN prediction model. The results suggest that SNN may be the best of these tools for citrullination site prediction. The optimal SNN classifier reached a maximum Matthews Correlation Coefficient (MCC) of 0.672404.
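The evaluation loop described above (ranked features, IFS, cross-validation, MCC) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the nearest-centroid classifier is a simple stand-in for the SNN, the feature ranking is supplied by hand rather than computed by mRMR, and the toy data and all function names (`mcc`, `centroid_predict`, `cross_val_mcc`, `incremental_feature_selection`) are assumptions introduced for the example.

```python
import math
import random


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (0/1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal count is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def centroid_predict(train_X, train_y, test_X):
    """Stand-in classifier (NOT the SNN): pick the nearest class centroid."""
    centroids = {}
    for label in (0, 1):
        rows = [x for x, y in zip(train_X, train_y) if y == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]

    def dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))

    return [min((0, 1), key=lambda c: dist(x, centroids[c])) for x in test_X]


def cross_val_mcc(X, y, k=10, seed=0):
    """k-fold cross-validation: pool held-out predictions, then compute one MCC."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    preds = [None] * len(y)
    for fold in folds:
        held = set(fold)
        train = [i for i in idx if i not in held]
        fold_preds = centroid_predict(
            [X[i] for i in train], [y[i] for i in train], [X[i] for i in fold]
        )
        for i, p in zip(fold, fold_preds):
            preds[i] = p
    return mcc(y, preds)


def incremental_feature_selection(X, y, ranked, k=10):
    """Score the top-1, top-2, ..., top-n ranked features; keep the best subset size."""
    curve = []
    for n in range(1, len(ranked) + 1):
        cols = ranked[:n]
        X_sub = [[row[c] for c in cols] for row in X]
        curve.append(cross_val_mcc(X_sub, y, k=k))
    best_n = max(range(len(curve)), key=curve.__getitem__) + 1
    return best_n, curve[best_n - 1], curve


# Toy data (hypothetical): feature 0 separates the classes, feature 1 is noise.
rng = random.Random(1)
y = [i % 2 for i in range(40)]
X = [[float(label), rng.uniform(-5.0, 5.0)] for label in y]
ranked = [0, 1]  # assumed ranking; in the study this order would come from mRMR

best_n, best_mcc, curve = incremental_feature_selection(X, y, ranked, k=10)
print(best_n, best_mcc, curve)
```

In this sketch the subset size with the highest cross-validated MCC is kept, which mirrors the idea of reporting the maximum MCC over the IFS curve; swapping in a different classifier only requires replacing `centroid_predict`.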
Conclusion: The results showed that the SNN-based prediction method performed better than the other three models when evaluated by common metrics such as MCC, accuracy, and F1-Measure. The SNN prediction model also achieved a better balance between the recognition of positive and negative samples than the other three models.
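For reference, the metrics named above are usually defined from the confusion-matrix counts as follows. The abstract does not state the formulas, so these are the standard definitions, assumed here, with TP, TN, FP, and FN denoting true positives, true negatives, false positives, and false negatives.

```latex
\begin{aligned}
\mathrm{MCC} &= \frac{TP \cdot TN - FP \cdot FN}
  {\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}} \\[4pt]
\mathrm{Accuracy} &= \frac{TP + TN}{TP + TN + FP + FN} \\[4pt]
F_1 &= \frac{2\,TP}{2\,TP + FP + FN}
\end{aligned}
```

Because MCC uses all four counts, it is generally less sensitive than accuracy to class imbalance such as the 116 positive versus 348 negative samples used here.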
Keywords: IFS (incremental feature selection); PTM (post-translational modification); SNN (self-normalizing neural network); citrullination site; mRMR (minimum redundancy maximum relevance); protein sequence.
Copyright© Bentham Science Publishers; For any queries, please email at firstname.lastname@example.org.