Composite Module Analyst: a fitness-based tool for identification of transcription factor binding site combinations

Bioinformatics. 2006 May 15;22(10):1190-7. doi: 10.1093/bioinformatics/btl041. Epub 2006 Feb 10.


Motivation: Functionally related genes involved in the same molecular-genetic, biochemical or physiological process are often regulated coordinately. Such regulation is achieved through the precisely organized binding of multiple transcription factors (TFs) to their target sites (cis-elements) in the regulatory regions of genes. Combinations of cis-elements provide a structural basis for generating unique patterns of gene expression.

Results: Here we present a new approach for defining promoter models based on the composition of TF binding sites and their pairs. We use a multicomponent fitness function to select the promoter model that best fits the observed gene expression profile. Using this fitness function together with a genetic algorithm, we demonstrate successful analyses of functionally related or co-expressed genes, as well as tests on simulated and permuted data.
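The general idea of searching for a TF combination with a genetic algorithm can be illustrated with a minimal sketch. It is not the CMA implementation: the abstract does not specify the fitness components, so the two-component fitness used here (foreground/background separation minus a model-size penalty), the scoring of single sites and pairs, the toy promoter data, and all names (`promoter_score`, `fitness`, `evolve`, `size_penalty`) are assumptions made for illustration only.

```python
import random
from itertools import combinations
from statistics import mean

# Toy data (hypothetical): each promoter is the set of TFs with a predicted site.
TFS = ["AP1", "NFAT", "SP1", "ETS", "NFKB", "CREB"]
FOREGROUND = [{"AP1", "NFAT", "SP1"}, {"AP1", "NFAT"},
              {"AP1", "NFAT", "ETS"}, {"AP1", "NFAT", "SP1"}]
BACKGROUND = [{"AP1", "SP1"}, {"NFAT", "ETS"},
              {"AP1", "NFKB"}, {"NFAT", "SP1"}]


def promoter_score(model, sites):
    """Fraction of the model's single TFs and TF pairs found in one promoter."""
    singles = sorted(model)
    pairs = list(combinations(singles, 2))
    hits = sum(tf in sites for tf in singles)
    hits += sum(a in sites and b in sites for a, b in pairs)
    return hits / (len(singles) + len(pairs))


def fitness(model, foreground, background, size_penalty=0.05):
    """Assumed two-component fitness: class separation minus a size penalty."""
    separation = (mean(promoter_score(model, p) for p in foreground)
                  - mean(promoter_score(model, p) for p in background))
    return separation - size_penalty * len(model)


def evolve(tfs, foreground, background, pop_size=30, generations=40, seed=0):
    """Simple elitist genetic algorithm over non-empty subsets of TFs."""
    rng = random.Random(seed)

    def score(model):
        return fitness(model, foreground, background)

    max_start = min(3, len(tfs))
    population = [frozenset(rng.sample(tfs, rng.randint(1, max_start)))
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the better half of the population unchanged.
        survivors = sorted(population, key=score, reverse=True)[:pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            # Crossover: each TF from either parent is inherited with p = 0.5.
            # Sorting keeps the run reproducible for a given seed.
            child = {tf for tf in sorted(a | b) if rng.random() < 0.5}
            # Mutation: occasionally toggle one randomly chosen TF.
            if rng.random() < 0.3:
                child ^= {rng.choice(tfs)}
            if child:  # discard empty models
                children.append(frozenset(child))
        population = survivors + children
    return max(population, key=score)


best_model = evolve(TFS, FOREGROUND, BACKGROUND)
print(sorted(best_model), round(fitness(best_model, FOREGROUND, BACKGROUND), 3))
```

In this toy data set the AP1/NFAT pair co-occurs in every foreground promoter but never in the background, so a model containing both factors scores higher than either factor alone. A real application would replace the presence/absence sets with scored binding-site predictions and tune the fitness components against the expression data.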

Availability: The CMA program is freely available for non-commercial users. It is also part of the commercial system ExPlain, which is designed for causal analysis of gene expression data.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Amino Acid Sequence
  • Binding Sites
  • Computer Simulation
  • Models, Chemical*
  • Molecular Sequence Data
  • Promoter Regions, Genetic / genetics
  • Protein Binding
  • Sequence Analysis, Protein / methods*
  • Software*
  • Transcription Factors / chemistry*
  • Transcription Factors / genetics*


Substances

  • Transcription Factors