Prediction of regulatory interactions from genome sequences using a biophysical model for the Arabidopsis LEAFY transcription factor

Plant Cell. 2011 Apr;23(4):1293-306. doi: 10.1105/tpc.111.083329. Epub 2011 Apr 22.


Despite great advances in sequencing technologies, generating functional information for nonmodel organisms remains a challenge. One solution lies in an improved ability to predict genetic circuits based on primary DNA sequence in combination with detailed knowledge of regulatory proteins that have been characterized in model species. Here, we focus on the LEAFY (LFY) transcription factor, a conserved master regulator of floral development. Starting with biochemical and structural information, we built a biophysical model describing LFY DNA binding specificity in vitro that accurately predicts in vivo LFY binding sites in the Arabidopsis thaliana genome. Applying the model to other plant species, we could follow the evolution of the regulatory relationship between LFY and the AGAMOUS (AG) subfamily of MADS box genes and show that this link predates the divergence between monocots and eudicots. Remarkably, our model succeeds in detecting the connection between LFY and AG homologs despite extensive variation in binding sites. This demonstrates that the cis-element fluidity recently observed in animals also exists in plants, but the challenges it poses can be overcome with predictions grounded in a biophysical model. Therefore, our work opens new avenues to deduce the structure of regulatory networks from mere inspection of genomic sequences.
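As a rough illustration of how predicting binding sites from primary DNA sequence can work, the sketch below scans a sequence with an additive log-odds matrix, a common simplification in which each base contributes independently to the binding score. The count matrix, its width, and all function names are hypothetical and chosen only for illustration. They are not the LFY matrix or the model from this paper, which may differ in form.

```python
import math

# Hypothetical 4-position count matrix for illustration only.
# This is NOT the LFY binding matrix from the paper.
COUNTS = {
    "A": [8, 1, 1, 6],
    "C": [1, 7, 1, 2],
    "G": [1, 1, 7, 1],
    "T": [2, 3, 3, 3],
}
BASES = "ACGT"
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Convert base counts to per-position log-odds scores.

    Assumes a uniform background and independent positions.
    """
    width = len(counts["A"])
    matrix = []
    for i in range(width):
        total = sum(counts[b][i] + pseudocount for b in BASES)
        column = {
            b: math.log((counts[b][i] + pseudocount) / total / background)
            for b in BASES
        }
        matrix.append(column)
    return matrix


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return "".join(COMPLEMENT[b] for b in reversed(seq))


def score_site(matrix, site):
    """Sum the per-position scores for one site of matrix width."""
    return sum(column[base] for column, base in zip(matrix, site))


def scan(matrix, seq):
    """Score every window on both strands; keep the better strand."""
    width = len(matrix)
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        if any(b not in COMPLEMENT for b in window):
            continue  # skip windows containing N or other symbols
        forward = score_site(matrix, window)
        reverse = score_site(matrix, reverse_complement(window))
        hits.append((start, max(forward, reverse)))
    return hits


def best_hit(matrix, seq):
    """Return the (start, score) pair with the highest score."""
    return max(scan(matrix, seq), key=lambda hit: hit[1])
```

Calling `best_hit(log_odds_matrix(COUNTS), sequence)` returns the start position and score of the top-ranked window. In a genome-wide setting one would typically rank all windows, or aggregate scores across a regulatory region, rather than keep a single hit.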

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • AGAMOUS Protein, Arabidopsis / genetics
  • AGAMOUS Protein, Arabidopsis / metabolism
  • Arabidopsis / genetics*
  • Arabidopsis Proteins / genetics*
  • Base Sequence
  • Binding Sites
  • Biophysical Phenomena*
  • Chromatin Immunoprecipitation
  • DNA, Plant / genetics
  • Evolution, Molecular
  • Flowers / genetics
  • Flowers / growth & development
  • Gene Expression Regulation, Plant*
  • Genes, Plant / genetics
  • Genome, Plant / genetics*
  • Introns / genetics
  • Models, Genetic*
  • Molecular Sequence Data
  • Protein Binding
  • Regulatory Sequences, Nucleic Acid / genetics
  • Reproducibility of Results
  • Transcription Factors / genetics*

Substances

  • AGAMOUS Protein, Arabidopsis
  • Arabidopsis Proteins
  • DNA, Plant
  • LFY protein, Arabidopsis
  • Transcription Factors