PredCRP: predicting and analysing the regulatory roles of CRP from its binding sites in Escherichia coli

Sci Rep. 2018 Jan 17;8(1):951. doi: 10.1038/s41598-017-18648-5.

Abstract

Cyclic AMP receptor protein (CRP), a global regulator in Escherichia coli, regulates more than 180 genes via two roles: activation and repression. Few methods are available for predicting the regulatory roles of transcription factors from their binding sites. This work proposes an accurate method, PredCRP, that derives an optimised model (PredCRP-model) and a set of four interpretable rules (PredCRP-ruleset) for predicting and analysing the regulatory roles of CRP from the sequences of CRP-binding sites. A dataset of 169 CRP-binding sites whose regulatory roles are strongly supported by evidence was compiled. PredCRP-model, which combines 12 informative features of CRP-binding sites with a support vector machine, achieved training and test accuracies of 0.98 and 0.93, respectively. PredCRP-ruleset comprises two activation rules and two repression rules derived from the same 12 features using the C4.5 decision tree method. This work further screened and identified 23 previously unobserved regulatory interactions in Escherichia coli. When validated by quantitative PCR, PredCRP-model and PredCRP-ruleset achieved test accuracies of 0.96 (22/23) and 0.91 (21/23), respectively, on these interactions. The proposed method is suitable for designing predictors for regulatory roles of all global regulators in Escherichia coli. PredCRP can be accessed at https://github.com/NctuICLab/PredCRP.
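The validation accuracies quoted in the abstract are plain ratios of correctly predicted regulatory roles to the 23 interactions tested. The sketch below only illustrates that arithmetic. The `accuracy` helper and the label lists are hypothetical placeholders, not the paper's data or code; they are built so that 22 and 21 of the 23 predictions agree with the observed role.

```python
def accuracy(predicted, observed):
    """Return the fraction of sites whose predicted role matches the observed role."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("no sites to score")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


# Placeholder labels (assumption): 23 sites with arbitrary observed roles.
observed = ["activation"] * 12 + ["repression"] * 11

# Hypothetical model predictions: one disagreement, so 22 of 23 match.
model_pred = list(observed)
model_pred[0] = "repression"

# Hypothetical ruleset predictions: two disagreements, so 21 of 23 match.
ruleset_pred = list(observed)
ruleset_pred[0] = "repression"
ruleset_pred[-1] = "activation"

model_acc = accuracy(model_pred, observed)
ruleset_acc = accuracy(ruleset_pred, observed)

print(f"model:   {model_acc:.2f}")    # 22/23 rounds to 0.96
print(f"ruleset: {ruleset_acc:.2f}")  # 21/23 rounds to 0.91
```

With only 23 validated interactions, a single prediction shifts the accuracy by about 0.04, which is the whole difference between the two reported figures.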

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Binding Sites / physiology*
  • Cyclic AMP / metabolism
  • Cyclic AMP Receptor Protein / metabolism*
  • DNA, Bacterial / genetics
  • Escherichia coli / metabolism*
  • Escherichia coli Proteins / metabolism*
  • Gene Expression Regulation, Bacterial / genetics
  • Protein Binding / physiology
  • Transcription Factors / metabolism

Substances

  • Cyclic AMP Receptor Protein
  • DNA, Bacterial
  • Escherichia coli Proteins
  • Transcription Factors
  • Cyclic AMP