FKRR-MVSF: A Fuzzy Kernel Ridge Regression Model for Identifying DNA-Binding Proteins by Multi-View Sequence Features via Chou's Five-Step Rule

Int J Mol Sci. 2019 Aug 26;20(17):4175. doi: 10.3390/ijms20174175.

Abstract

DNA-binding proteins play an important role in cell metabolism. In biological laboratories, methods for detecting DNA-binding proteins include yeast one-hybrid methods, bacterial one-hybrid methods, and X-ray crystallography, among others, but these methods require considerable labor, materials, and time. In recent years, many computation-based approaches have been proposed to detect DNA-binding proteins. In this paper, a machine learning-based method, called the Fuzzy Kernel Ridge Regression model based on Multi-View Sequence Features (FKRR-MVSF), is proposed to identify DNA-binding proteins. First, multi-view sequence features are extracted from protein sequences. Next, a Multiple Kernel Learning (MKL) algorithm is employed to combine the multiple features. Finally, a Fuzzy Kernel Ridge Regression (FKRR) model is built to detect DNA-binding proteins. Compared with other methods, our model achieves good results: it obtains accuracies of 83.26% and 81.72% on two benchmark datasets, PDB1075 and PDB186, respectively.
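
The three-stage pipeline described above (per-view features, kernel combination, fuzzy kernel ridge regression) can be illustrated with a minimal sketch. The abstract does not specify the kernels, the MKL weighting scheme, or the fuzzy membership function, so everything below is an assumption made for illustration: RBF kernels per view, a uniform weighted sum standing in for MKL, class-centre distance memberships, and the function names themselves. The closed form used is the standard sample-weighted kernel ridge solution, alpha = (K + lambda * S^-1)^-1 y, where S is the diagonal matrix of memberships.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    """RBF kernel matrix between the rows of A and the rows of B."""
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def combine_kernels(kernels, weights=None):
    """Weighted sum of per-view kernels.

    Assumption: uniform weights stand in for a learned MKL weighting.
    """
    if weights is None:
        weights = np.full(len(kernels), 1.0 / len(kernels))
    return sum(w * K for w, K in zip(weights, kernels))

def fuzzy_memberships(X, y, eps=1e-3):
    """Assumed heuristic: samples far from their class centre get
    a lower membership, so suspected outliers influence the fit less."""
    s = np.empty(len(y))
    for label in np.unique(y):
        idx = y == label
        d = np.linalg.norm(X[idx] - X[idx].mean(axis=0), axis=1)
        s[idx] = 1.0 - d / (d.max() + eps)
    return np.clip(s, eps, 1.0)

def fit_fkrr(K, y, s, lam=1.0):
    """Sample-weighted kernel ridge regression in closed form:
    alpha = (K + lam * diag(1/s))^-1 y."""
    return np.linalg.solve(K + lam * np.diag(1.0 / s), y)

def predict(K_test_train, alpha):
    """Sign of the regression output gives the +1 / -1 class label."""
    return np.sign(K_test_train @ alpha)

if __name__ == "__main__":
    # Synthetic stand-in for two feature views of the same proteins.
    rng = np.random.default_rng(0)
    y = np.where(np.arange(40) < 20, 1.0, -1.0)
    view1 = rng.normal(size=(40, 5)) + 2 * y[:, None]
    view2 = rng.normal(size=(40, 3)) + 2 * y[:, None]

    K = combine_kernels([rbf_kernel(view1, view1, 0.1),
                         rbf_kernel(view2, view2, 0.1)])
    s = fuzzy_memberships(np.hstack([view1, view2]), y)
    alpha = fit_fkrr(K, y, s, lam=1.0)
    print("training accuracy:", (predict(K, alpha) == y).mean())
```

In this sketch a membership near zero inflates the ridge penalty for that sample, so its label has little pull on the fitted function; a learned MKL weighting would replace the uniform weights passed to `combine_kernels`.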

Keywords: DNA-binding proteins prediction; feature extraction; fuzzy kernel ridge regression; multiple kernel learning; protein sequence.

MeSH terms

  • Computational Biology* / methods
  • DNA-Binding Proteins / chemistry*
  • DNA-Binding Proteins / metabolism*
  • Machine Learning*
  • ROC Curve
  • Regression Analysis*
  • Reproducibility of Results
  • Sensitivity and Specificity

Substances

  • DNA-Binding Proteins