Genome-wide prediction of prokaryotic two-component system networks using a sequence-based meta-predictor

BMC Bioinformatics. 2015 Sep 18;16:297. doi: 10.1186/s12859-015-0741-7.


Background: Two-component systems (TCS) are signalling complexes composed of a histidine kinase (receptor) and a response regulator (effector). They are the most abundant signalling pathways in prokaryotes and control a wide range of biological processes. The pairing of these two components is highly specific, and identifying partners often requires costly and time-consuming experimental characterisation. Therefore, there is considerable interest in developing accurate prediction tools to lessen the burden of experimental work and cope with the ever-increasing amount of genomic information.

Results: We present a novel meta-predictor, MetaPred2CS, which is based on a support vector machine. MetaPred2CS integrates six sequence-based prediction methods: in-silico two-hybrid, mirror-tree, gene fusion, phylogenetic profiling, gene neighbourhood, and gene operon. To benchmark MetaPred2CS, we also compiled a novel high-quality training dataset of experimentally deduced TCS protein pairs for k-fold cross-validation, to act as a gold standard for TCS partnership predictions. Combining the individual predictions using MetaPred2CS improved performance relative to both the individual methods and a current state-of-the-art meta-predictor.
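
As an illustration only, the idea of combining six per-method scores in a support vector machine and assessing it by k-fold cross-validation might be sketched as below. This is not the MetaPred2CS implementation: the abstract does not state the kernel, the value of k, or any feature scaling, so the linear SVM (trained with a Pegasos-style subgradient method), the centring of scores at 0.5, the choice k = 5, the function names, and the synthetic data are all assumptions made for the example.

```python
import random

# Hypothetical feature order: one score in [0, 1] per method named in the abstract.
METHODS = ["in_silico_two_hybrid", "mirror_tree", "gene_fusion",
           "phylogenetic_profiling", "gene_neighbourhood", "gene_operon"]


def make_synthetic_pairs(n, rng):
    """Toy data (not real TCS pairs): label +1 pairs score higher on average."""
    data = []
    for i in range(n):
        label = 1 if i % 2 == 0 else -1
        centre = 0.7 if label == 1 else 0.3
        scores = [min(1.0, max(0.0, rng.gauss(centre, 0.15))) for _ in METHODS]
        data.append((scores, label))
    rng.shuffle(data)
    return data


def augment(scores):
    """Centre scores at 0.5 (an assumed preprocessing step) and append a bias input."""
    return [s - 0.5 for s in scores] + [1.0]


def train_linear_svm(data, rng, lam=0.01, epochs=50):
    """Pegasos-style stochastic subgradient descent on the regularised hinge loss."""
    w = [0.0] * (len(METHODS) + 1)  # last weight acts as the bias term
    t = 0
    for _ in range(epochs):
        for scores, y in rng.sample(data, len(data)):  # one shuffled pass
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            x = augment(scores)
            margin = y * sum(wi * xi for wi, xi in zip(w, x))
            w = [wi * (1.0 - eta * lam) for wi in w]  # shrink (regulariser)
            if margin < 1.0:  # hinge loss is active for this example
                w = [wi + eta * y * xi for wi, xi in zip(w, x)]
    return w


def predict(w, scores):
    """Return +1 (predicted interacting pair) or -1 (predicted non-interacting)."""
    return 1 if sum(wi * xi for wi, xi in zip(w, augment(scores))) >= 0 else -1


def k_fold_accuracy(data, k, rng):
    """Train on k-1 folds, test on the held-out fold, for each of the k folds."""
    fold_size = len(data) // k  # any remainder is only ever used for training
    accuracies = []
    for i in range(k):
        test = data[i * fold_size:(i + 1) * fold_size]
        train = data[:i * fold_size] + data[(i + 1) * fold_size:]
        w = train_linear_svm(train, rng)
        correct = sum(1 for scores, y in test if predict(w, scores) == y)
        accuracies.append(correct / len(test))
    return accuracies


if __name__ == "__main__":
    rng = random.Random(0)
    pairs = make_synthetic_pairs(200, rng)
    fold_acc = k_fold_accuracy(pairs, 5, rng)
    print("mean accuracy over folds:", sum(fold_acc) / len(fold_acc))
```

In this sketch each candidate kinase/regulator pair is reduced to a six-element score vector, so the relative size of the learned weights gives a rough indication of how much each method contributes to the combined decision.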

Conclusion: We have developed MetaPred2CS, a support vector machine-based meta-predictor for prokaryotic TCS protein pairings. Central to the success of MetaPred2CS is a strategy of integrating individual predictors that improves the overall prediction accuracy, with the in-silico two-hybrid method contributing most to performance. MetaPred2CS outperformed other available systems in our benchmark tests, and is available online, along with our gold standard dataset of TCS interaction pairs.

MeSH terms

  • Area Under Curve
  • Bacteria / genetics
  • Bacterial Proteins / chemistry
  • Bacterial Proteins / metabolism
  • Genome, Bacterial
  • Histidine Kinase
  • Protein Interaction Maps
  • Protein Kinases / chemistry
  • Protein Kinases / metabolism
  • ROC Curve
  • Support Vector Machine*


Substances

  • Bacterial Proteins
  • Protein Kinases
  • Histidine Kinase