Seed-based systematic discovery of specific transcription factor target genes

FEBS J. 2008 Jun;275(12):3178-92. doi: 10.1111/j.1742-4658.2008.06471.x. Epub 2008 May 13.


Reliable prediction of specific transcription factor target genes is a major challenge in systems biology and functional genomics. Current sequence-based methods yield many false predictions because DNA-binding motifs are short and degenerate. Here, we describe a new systematic genome-wide approach, the seed-distribution-distance method, that searches large-scale expression data for genes that are expressed similarly to known targets (the seed). This method is used to identify genes that are likely targets, allowing sequence-based methods to focus on a subset of genes and giving rise to fewer false-positive predictions. We show by cross-validation that this method is robust in recovering specific target genes. Furthermore, this method identifies genes with the typical functions and binding motifs of the seed. The method is illustrated by predicting novel targets of the transcription factor nuclear factor kappaB (NF-kappaB). Among the new targets is optineurin, which plays a key role in the pathogenesis of acquired blindness caused by adult-onset primary open-angle glaucoma. We show experimentally that the optineurin gene and other predicted genes are targets of NF-kappaB. Thus, our data provide a missing link in the signalling of NF-kappaB and the damping function of optineurin in signalling feedback of NF-kappaB. We present a robust and reliable method to enhance the genome-wide prediction of specific transcription factor target genes that exploits the vast amount of expression information available in public databases today.
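
The general idea of ranking genes by how closely their expression follows a seed of known targets, and checking robustness by cross-validation, can be sketched as follows. The abstract does not define the seed-distribution-distance score, so this sketch substitutes a plain mean Pearson correlation to the seed genes; it is not the authors' method. The gene names, the expression values, and the functions `rank_candidates` and `leave_one_out_ranks` are hypothetical and chosen only for illustration.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)

def rank_candidates(expression, seed):
    """Rank non-seed genes by mean correlation to the seed genes (assumed score)."""
    scores = {}
    for gene, profile in expression.items():
        if gene in seed:
            continue
        scores[gene] = sum(pearson(profile, expression[s]) for s in seed) / len(seed)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

def leave_one_out_ranks(expression, seed):
    """Hold out each seed gene in turn and report the rank at which it is recovered."""
    ranks = {}
    for held_out in seed:
        ranking = rank_candidates(expression, seed - {held_out})
        names = [gene for gene, _ in ranking]
        ranks[held_out] = names.index(held_out) + 1
    return ranks

# Toy expression profiles across four conditions (made-up numbers).
expression = {
    "seed_a": [1.0, 2.0, 3.0, 4.0],
    "seed_b": [2.0, 4.0, 6.0, 8.0],
    "gene_x": [1.1, 2.1, 2.9, 4.2],   # follows the seed closely
    "gene_y": [4.0, 3.0, 2.0, 1.0],   # anti-correlated with the seed
    "gene_z": [1.0, 1.0, 1.0, 1.0],   # flat profile
}
seed = {"seed_a", "seed_b"}

ranking = rank_candidates(expression, seed)
loo = leave_one_out_ranks(expression, seed)
```

In a workflow of this kind, only the top-ranked candidates would be passed on to a sequence-based motif search, which is how the expression step is meant to cut down false-positive predictions.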

Publication types

  • Research Support, Non-U.S. Gov't
  • Validation Study

MeSH terms

  • Binding Sites
  • Cell Cycle Proteins
  • Databases, Genetic
  • Gene Expression Profiling*
  • Gene Expression Regulation*
  • Genomics / methods
  • Humans
  • Membrane Transport Proteins
  • NF-kappa B / metabolism*
  • Oligonucleotide Array Sequence Analysis
  • Promoter Regions, Genetic*
  • Transcription Factor TFIIIA / genetics


Substances

  • Cell Cycle Proteins
  • Membrane Transport Proteins
  • NF-kappa B
  • OPTN protein, human
  • Transcription Factor TFIIIA