iTerm-PseKNC: a sequence-based tool for predicting bacterial transcriptional terminators

Bioinformatics. 2019 May 1;35(9):1469-1477. doi: 10.1093/bioinformatics/bty827.

Abstract

Motivation: Transcription termination is an important regulatory step of gene expression. If a gene lacks a terminator, transcription cannot stop, which results in abnormal gene expression. Detecting terminators helps determine operon structure in bacterial organisms and improves genome annotation. Thus, accurate identification of transcriptional terminators is essential to the study of transcriptional regulation.

Results: In this study, we developed a new predictor called 'iTerm-PseKNC', based on a support vector machine, to identify transcription terminators. The binomial distribution approach was used to select the optimal feature subset derived from pseudo k-tuple nucleotide composition (PseKNC). In a 5-fold cross-validation test, the proposed method achieved an accuracy of 95%. To further evaluate the generalization ability of 'iTerm-PseKNC', the model was tested on independent datasets of experimentally confirmed Rho-independent terminators from the Escherichia coli and Bacillus subtilis genomes. All the terminators in E. coli and 87.5% of the terminators in B. subtilis were correctly identified, suggesting that the proposed model could become a powerful tool for bacterial terminator recognition.
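The feature-selection idea named above can be illustrated with a minimal Python sketch. It is not the authors' implementation, and the abstract does not give the exact definitions, so everything here is an assumption: the features are plain k-tuple frequencies (the physicochemical correlation terms of full PseKNC are omitted), the binomial score is one common formulation (the upper-tail probability of a k-tuple's count in a class, converted to a confidence level), and the function names `kmer_counts`, `kmer_composition`, `binomial_tail` and `rank_kmers` are hypothetical.

```python
from collections import Counter
from itertools import product
from math import comb


def kmer_counts(seq, k):
    """Count overlapping k-tuples, skipping any that contain non-ACGT symbols."""
    seq = seq.upper()
    valid = set("ACGT")
    return Counter(
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= valid
    )


def kmer_composition(seq, k):
    """Normalized frequency of each of the 4**k possible k-tuples."""
    counts = kmer_counts(seq, k)
    total = sum(counts.values())
    kmers = ("".join(p) for p in product("ACGT", repeat=k))
    return {kmer: (counts[kmer] / total if total else 0.0) for kmer in kmers}


def binomial_tail(n, N, q):
    """P(X >= n) for X ~ Binomial(N, q)."""
    return sum(comb(N, m) * q**m * (1 - q)**(N - m) for m in range(n, N + 1))


def rank_kmers(pos_seqs, neg_seqs, k):
    """Rank k-tuples by confidence level (1 - smallest class-wise tail probability).

    Assumed formulation: the prior q for a class is that class's share of all
    k-tuple occurrences; a k-tuple scores highly when its count in one class
    is unlikely under that prior.
    """
    pos, neg = Counter(), Counter()
    for s in pos_seqs:
        pos.update(kmer_counts(s, k))
    for s in neg_seqs:
        neg.update(kmer_counts(s, k))

    total_pos, total_neg = sum(pos.values()), sum(neg.values())
    if total_pos + total_neg == 0:
        raise ValueError("no valid k-tuples found in either class")
    q_pos = total_pos / (total_pos + total_neg)

    scores = {}
    for p in product("ACGT", repeat=k):
        kmer = "".join(p)
        n_total = pos[kmer] + neg[kmer]
        p_pos = binomial_tail(pos[kmer], n_total, q_pos)
        p_neg = binomial_tail(neg[kmer], n_total, 1 - q_pos)
        scores[kmer] = 1.0 - min(p_pos, p_neg)
    # Highest confidence first; ties keep alphabetical k-tuple order.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

In a pipeline of this kind, the top-ranked k-tuples would typically be added to the feature set one at a time, with a classifier such as a support vector machine retrained and cross-validated at each step to find the subset that gives the best accuracy. That incremental loop is not shown here.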

Availability and implementation: For the convenience of wet-lab researchers, a web server for 'iTerm-PseKNC' was established at http://lin-group.cn/server/iTerm-PseKNC/, through which users can easily obtain their desired results without working through the detailed mathematical equations involved.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Bacillus subtilis
  • Escherichia coli
  • Nucleotides
  • Operon
  • Software
  • Transcription, Genetic*

Substances

  • Nucleotides