Visualizing associations between genome sequences and gene expression data using genome-mean expression profiles

Bioinformatics. 2001;17 Suppl 1:S49-55. doi: 10.1093/bioinformatics/17.suppl_1.s49.

Abstract

The combination of genome-wide expression patterns and full genome sequences offers a great opportunity to further our understanding of the mechanisms and logic of transcriptional regulation. Many methods have been described that identify sequence motifs enriched in the transcription control regions of genes that share similar expression patterns. Here we present an alternative approach that evaluates the transcriptional information carried by specific sequence motifs by computing, for each motif, the mean expression profile of all genes that contain the motif in their transcription control regions. These genome-mean expression profiles (GMEPs) are valuable for visualizing the relationship between genome sequences and gene expression data, and for characterizing the transcriptional importance of specific sequence motifs. Analysis of GMEPs calculated from a dataset of 519 whole-genome microarray experiments in Saccharomyces cerevisiae shows a significant correlation between the GMEPs of motifs that are reverse complements, a result that supports the relationship between GMEPs and transcriptional regulation. Hierarchical clustering of GMEPs identifies clusters of motifs that correspond to binding sites of well-characterized transcription factors. The GMEPs of these clustered motifs have patterns of variation across conditions that reflect the known activities of these transcription factors. Software that computes GMEPs from sequence and gene expression data is available under the terms of the GNU Public License from http://rana.lbl.gov/.
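The calculation described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' software. The function names, the use of fixed-length k-mers as motifs, the toy sequences, and the toy expression values are all assumptions made for illustration. A gene is counted once per motif regardless of how many times the motif occurs in its control region, and sequences are assumed to contain only uppercase A, C, G, and T.

```python
import math

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(motif):
    """Return the reverse complement of an uppercase A/C/G/T motif."""
    return "".join(COMPLEMENT[base] for base in reversed(motif))


def genome_mean_expression_profiles(upstream, expression, k):
    """Mean expression profile, per k-mer, over genes whose control region contains it.

    upstream:   dict of gene -> control-region sequence (hypothetical input format)
    expression: dict of gene -> list of expression values, one per condition
    """
    sums, counts = {}, {}
    for gene, seq in upstream.items():
        profile = expression.get(gene)
        if profile is None:
            continue  # gene has no expression data
        # The set ensures each gene contributes at most once per motif.
        for motif in {seq[i:i + k] for i in range(len(seq) - k + 1)}:
            total = sums.setdefault(motif, [0.0] * len(profile))
            for j, value in enumerate(profile):
                total[j] += value
            counts[motif] = counts.get(motif, 0) + 1
    return {m: [s / counts[m] for s in total] for m, total in sums.items()}


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (NaN if either is constant)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return float("nan")
    return cov / (sx * sy)


# Toy data, invented for illustration only: three genes, three conditions.
upstream = {"g1": "AACGT", "g2": "ACGTT", "g3": "GTTAA"}
expression = {"g1": [1.0, 0.0, 2.0], "g2": [3.0, 0.0, 0.0], "g3": [0.0, 6.0, 0.0]}

gmeps = genome_mean_expression_profiles(upstream, expression, k=3)

# Correlation between each motif's GMEP and that of its reverse complement,
# for motifs whose reverse complement also occurs in the toy data.
rc_correlation = {
    motif: pearson(profile, gmeps[reverse_complement(motif)])
    for motif, profile in gmeps.items()
    if reverse_complement(motif) in gmeps
}
```

In this toy example the motif ACG occurs in g1 and g2, so its GMEP is the element-wise mean of those two genes' profiles. With real data, the resulting motif-by-condition matrix could be passed to any hierarchical clustering routine to group motifs with similar profiles.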

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms*
  • Base Sequence
  • Cluster Analysis
  • Computational Biology
  • DNA, Fungal / genetics
  • Gene Expression Profiling / statistics & numerical data*
  • Genome*
  • Genome, Fungal
  • Oligonucleotide Array Sequence Analysis / statistics & numerical data
  • Saccharomyces cerevisiae / genetics
  • Software

Substances

  • DNA, Fungal