A multivariate prediction model for Rho-dependent termination of transcription

Nucleic Acids Res. 2018 Sep 19;46(16):8245-8260. doi: 10.1093/nar/gky563.

Abstract

Bacterial transcription termination proceeds via two main mechanisms triggered either by simple, well-conserved (intrinsic) nucleic acid motifs or by the motor protein Rho. Although bacterial genomes can harbor hundreds of termination signals of either type, only intrinsic terminators are reliably predicted. Computational tools to detect the more complex and diversiform Rho-dependent terminators are lacking. To tackle this issue, we devised a prediction method based on Orthogonal Projections to Latent Structures Discriminant Analysis (OPLS-DA) of a large set of in vitro termination data. Using previously uncharacterized genomic sequences for biochemical evaluation and OPLS-DA, we identified new Rho-dependent signals and quantitative sequence descriptors with significant predictive value. Most relevant descriptors specify features of transcript C>G skewness, secondary structure, and richness in regularly-spaced 5'CC/UC dinucleotides that are consistent with known principles for Rho-RNA interaction. Descriptors collectively warrant OPLS-DA predictions of Rho-dependent termination with a ∼85% success rate. Scanning of the Escherichia coli genome with the OPLS-DA model identifies significantly more termination-competent regions than anticipated from transcriptomics and predicts that regions intrinsically refractory to Rho are primarily located in open reading frames. Altogether, this work delineates features important for Rho activity and describes the first method able to predict Rho-dependent terminators in bacterial genomes.
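
As a rough illustration of the kind of sequence descriptors mentioned above, the following Python sketch computes a C>G skew and counts 5'CC/UC dinucleotides over sliding windows of a transcript. It is not the published method: the abstract does not give the descriptor definitions, so the skew formula (C - G)/(C + G), the function names, and the window and step sizes are all assumptions made for this example. It does not model secondary structure or dinucleotide spacing, and it does not perform OPLS-DA.

```python
from typing import List, Tuple


def _as_rna(seq: str) -> str:
    """Uppercase the sequence and treat DNA input as its RNA equivalent."""
    return seq.upper().replace("T", "U")


def cg_skew(seq: str) -> float:
    """Assumed skew definition: (C - G) / (C + G); 0.0 if no C or G is present."""
    rna = _as_rna(seq)
    c, g = rna.count("C"), rna.count("G")
    return (c - g) / (c + g) if (c + g) else 0.0


def yc_positions(seq: str) -> List[int]:
    """0-based start positions of CC or UC dinucleotides (overlaps allowed)."""
    rna = _as_rna(seq)
    return [i for i in range(len(rna) - 1) if rna[i] in "CU" and rna[i + 1] == "C"]


def sliding_descriptors(
    seq: str, window: int = 80, step: int = 10
) -> List[Tuple[int, float, int]]:
    """Return (window start, C>G skew, CC/UC count) for each window.

    The window and step sizes are arbitrary illustrative defaults.
    """
    rna = _as_rna(seq)
    last_start = max(len(rna) - window, 0)
    rows = []
    for start in range(0, last_start + 1, step):
        segment = rna[start:start + window]
        rows.append((start, cg_skew(segment), len(yc_positions(segment))))
    return rows


if __name__ == "__main__":
    # Hypothetical toy sequence, not taken from the study.
    toy = "UCUCCACCUCCUUCCAAUCCGGGGAGGGCGGG"
    for start, skew, n_yc in sliding_descriptors(toy, window=16, step=8):
        print(start, round(skew, 2), n_yc)
```

In a workflow like the one described, per-window values of this kind would be only part of the input to a multivariate classifier; the actual descriptor set and model are defined in the full article.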

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computational Biology / methods*
  • Escherichia coli Proteins / genetics*
  • Escherichia coli Proteins / metabolism
  • Gene Expression Regulation, Bacterial
  • Genome, Bacterial / genetics*
  • Genomics / methods*
  • Models, Genetic
  • Multivariate Analysis
  • Rho Factor / genetics*
  • Rho Factor / metabolism
  • Transcription Termination, Genetic*

Substances

  • Escherichia coli Proteins
  • Rho Factor