Large-scale prediction of Saccharomyces cerevisiae gene function using overlapping transcriptional clusters

Nat Genet. 2002 Jul;31(3):255-65. doi: 10.1038/ng906. Epub 2002 Jun 24.


Genome sequencing has led to the discovery of tens of thousands of potential new genes. Six years after the sequencing of the well-studied yeast Saccharomyces cerevisiae and the discovery that its genome encodes approximately 6,000 predicted proteins, more than 2,000 have not yet been characterized experimentally, and determining their functions seems far from a trivial task. One crucial constraint is the generation of useful hypotheses about protein function. Using a new approach to interpret microarray data, we assign likely cellular functions with confidence values to these new yeast proteins. We perform extensive genome-wide validations of our predictions and offer visualization methods for exploration of the large numbers of functional predictions. We identify potential new members of many existing functional categories including 285 candidate proteins involved in transcription, processing and transport of non-coding RNA molecules. We present experimental validation confirming the involvement of several of these proteins in ribosomal RNA processing. Our methodology can be applied to a variety of genomics data types and organisms.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Cluster Analysis
  • Confidence Intervals
  • Databases, Genetic
  • Fungal Proteins / genetics
  • Fungal Proteins / metabolism
  • Fungal Proteins / physiology*
  • Gene Expression Regulation, Fungal
  • Genome, Fungal
  • Mathematics
  • Oligonucleotide Array Sequence Analysis
  • Open Reading Frames / genetics
  • Phenotype
  • Predictive Value of Tests
  • Probability
  • Protein Processing, Post-Translational / genetics
  • RNA, Ribosomal / genetics
  • RNA, Ribosomal / metabolism
  • Recombinant Fusion Proteins / metabolism
  • Reproducibility of Results
  • Saccharomyces cerevisiae / genetics*
  • Transcription, Genetic / genetics*

Substances

  • Fungal Proteins
  • RNA, Ribosomal
  • Recombinant Fusion Proteins