Sense codons are found in specific contexts

J Mol Biol. 1985 Apr 20;182(4):529-40. doi: 10.1016/0022-2836(85)90239-6.

Abstract

The sequence environment of codons in structural genes has been investigated statistically, using computer methods. A set of Escherichia coli genes with abundant products was compared with a set having low gene product levels, in order to detect potential differences associated with expression. The results show striking non-randomness in the nucleotides occurring near codons. These effects are, unexpectedly, very much larger and more homogeneous among the genes with rare products. The intensity of effects in weakly expressed genes suggests that such non-random sequence environments decrease expression. In the weakly expressed set of genes, the 5' neighbor of a codon and all positions of the 3' neighbor codon are biased. In the highly expressed genes, the first nucleotide of the next codon is a uniquely affected site. The distribution of non-randomness in weakly expressed genes suggests that sequence bias is primarily due to a constraint acting directly on the secondary or tertiary structure of the codon/anticodon. In highly expressed genes, the observed bias suggests an interaction between the codon/anticodon and a site outside the codon/anticodon. Much of the tendency to non-random near-neighbor sequences in weakly expressed genes can be ascribed to a correlation between nearby nucleotides and the wobble nucleotide of the codon, despite the fact that selection of such correlations will alter the amino acid sequence. The favored pattern, in genes expressed at low levels, is R YYR or Y RRY. R indicates purine, Y indicates pyrimidine; the space is the boundary between codons. It seems likely that this preference for nearby sequences is the physical basis of the genetic context effect. Under this assumption such sequence biases will affect expression. On this basis, we predict new sites for contextual mutations which decrease expression, and suggest a strategy for the design of messages having optimal translational activity.
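
The R YYR / Y RRY notation can be illustrated with a short Python sketch. It is not the statistical method used in the paper, which the abstract does not specify; it only shows one way to reduce a coding sequence to purine/pyrimidine classes and tally the four-nucleotide windows that span each codon boundary (the last nucleotide of one codon, then the three nucleotides of the next). The function names `to_ry`, `boundary_patterns`, and `favored_fraction` are hypothetical, and the sketch assumes an in-frame sequence containing only A, C, G, T, or U.

```python
from collections import Counter

PURINES = set("AG")
PYRIMIDINES = set("CTU")

# Patterns the abstract reports as favored in weakly expressed genes.
FAVORED = {"R YYR", "Y RRY"}


def to_ry(seq):
    """Map each nucleotide to R (purine) or Y (pyrimidine)."""
    classes = []
    for base in seq.upper():
        if base in PURINES:
            classes.append("R")
        elif base in PYRIMIDINES:
            classes.append("Y")
        else:
            raise ValueError(f"unexpected nucleotide: {base!r}")
    return "".join(classes)


def boundary_patterns(cds):
    """Count R/Y windows of the form 'x xxx' across every codon boundary.

    Assumes `cds` is in frame, i.e. its length is a multiple of three.
    """
    if len(cds) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    ry = to_ry(cds)
    counts = Counter()
    # i is the first position of each codon after the first one.
    for i in range(3, len(ry), 3):
        window = ry[i - 1] + " " + ry[i:i + 3]
        counts[window] += 1
    return counts


def favored_fraction(cds):
    """Fraction of codon boundaries showing R YYR or Y RRY (0.0 if none)."""
    counts = boundary_patterns(cds)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(counts[p] for p in FAVORED) / total
```

Comparing `favored_fraction` between a set of weakly expressed genes and a set of highly expressed genes would give a rough view of the contrast the abstract describes, although a real analysis would also need to correct for nucleotide composition and codon usage.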

Publication types

  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Amino Acids / analysis
  • Base Sequence
  • Codon*
  • Computers
  • DNA, Bacterial
  • Escherichia coli / genetics
  • Gene Expression Regulation
  • Genes
  • Genes, Bacterial
  • Genetic Code*
  • Probability
  • RNA, Messenger*

Substances

  • Amino Acids
  • Codon
  • DNA, Bacterial
  • RNA, Messenger