The elements for a classification of units of genetic information with a combinatorial component

J Theor Biol. 1993 Aug 21;163(4):527-48. doi: 10.1006/jtbi.1993.1136.


This paper takes an integrative approach to the study of the regulation of gene expression. Its main goal is to make explicit the common rules that govern the relative location of regulatory sites within operons and other units of genetic information (UGIs). A classification that emphasizes the regulatory properties of UGIs can be achieved by partitioning UGIs into short sequences with defined properties. Such a classification scheme can be precisely defined as a grammar with two components: a set of combinatorial (rewriting) rules and a dictionary. Sequences must then be grouped into classes such that members of the same class can substitute for one another and produce novel regulatable UGIs. It is shown here that individual nucleotides cannot define such classes; they are far from equivalent to phonemes. Neither pairs, triplets, nor any short sequence with a defined number of nucleotides can define productive substitutions. Defined sequences such as promoters, operators, and activator binding sites are the smallest elements of combinatorial rules within the defined range of transcription initiation of sigma 70 Escherichia coli promoters.
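
The two-component grammar described above can be illustrated with a minimal sketch. Everything concrete in it is an assumption made for illustration and is not taken from the paper: the class names, the element labels (`P1`, `O1`, `A1`, `geneX`), the two rewriting rules, and the helper names `expand` and `generate_ugis` are all hypothetical. The sketch only shows the general idea that rewriting rules arrange classes, while the dictionary lists the interchangeable members of each class.

```python
from itertools import product

# Dictionary component: class name -> interchangeable elements.
# All labels are hypothetical placeholders, not real sequences.
DICTIONARY = {
    "Promoter": ["P1", "P2"],
    "Operator": ["O1", "O2"],
    "ActivatorSite": ["A1"],
    "Gene": ["geneX"],
}

# Combinatorial component: symbol -> alternative expansions.
# These two arrangements are illustrative assumptions only.
RULES = {
    "UGI": [["Regulation", "Gene"]],
    "Regulation": [["Promoter", "Operator"], ["ActivatorSite", "Promoter"]],
}


def expand(symbol):
    """Return every sequence of class names derivable from `symbol`."""
    if symbol not in RULES:
        return [[symbol]]  # a class name: looked up in the dictionary later
    patterns = []
    for alternative in RULES[symbol]:
        parts = [expand(s) for s in alternative]
        for combo in product(*parts):
            patterns.append([name for part in combo for name in part])
    return patterns


def generate_ugis(start="UGI"):
    """Substitute dictionary members into every derivable class pattern."""
    ugis = []
    for pattern in expand(start):
        for choice in product(*(DICTIONARY[cls] for cls in pattern)):
            ugis.append(list(choice))
    return ugis


if __name__ == "__main__":
    for ugi in generate_ugis():
        print(" - ".join(ugi))
```

In this toy setting, swapping one member of a class for another always yields a sequence the rules still accept, which is the substitution property the classes are required to have.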

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Base Sequence
  • Gene Expression Regulation / genetics*
  • Models, Genetic*
  • Operator Regions, Genetic / genetics
  • Operon / genetics
  • Promoter Regions, Genetic / genetics