Statistical mechanical modeling of genome-wide transcription factor occupancy data by MatrixREDUCE
- PMID: 16873464
- DOI: 10.1093/bioinformatics/btl223
Abstract
Motivation: Regulation of gene expression by a transcription factor requires physical interaction between the factor and the DNA, which can be described by a statistical mechanical model. Based on this model, we developed the MatrixREDUCE algorithm, which uses genome-wide occupancy data for a transcription factor (e.g. ChIP-chip) and associated nucleotide sequences to discover the sequence-specific binding affinity of the transcription factor. Advantages of our approach are that the information for all probes on the microarray is efficiently utilized because there is no need to delineate "bound" and "unbound" sequences, and that, unlike information content-based methods, it does not require a background sequence model.
Results: We validated the performance of MatrixREDUCE by inferring the sequence-specific binding affinities for several transcription factors in S. cerevisiae and comparing the results with three other independent sources of transcription factor sequence-specific affinity information: (i) experimental measurement of transcription factor binding affinities for specific oligonucleotides, (ii) reporter gene assays for promoters with systematically mutated binding sites, and (iii) relative binding affinities obtained by modeling transcription factor-DNA interactions based on co-crystal structures of transcription factors bound to DNA substrates. We show that transcription factor binding affinities inferred by MatrixREDUCE are in good agreement with all three validating methods.
Availability: MatrixREDUCE source code is freely available for non-commercial use at http://www.bussemakerlab.org/. The software runs on Linux, Unix, and Mac OS X.
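The abstract does not give the model equations, so the following is only a minimal sketch of the kind of model the Motivation paragraph describes, not the MatrixREDUCE implementation. It assumes a low-concentration regime in which predicted occupancy of a probe is proportional to the sum, over all windows in its sequence, of a relative affinity computed as a product of independent per-position terms. The function names, the toy 3-bp matrix, and the forward-strand-only scan are all hypothetical choices made for illustration. Because every probe contributes a continuous predicted value, the measured signal can be regressed on it directly, with no "bound"/"unbound" threshold and no background sequence model.

```python
# Illustrative sketch only: names and the toy matrix below are hypothetical
# and are not taken from the MatrixREDUCE source code.
BASES = "ACGT"

# Toy position-specific affinity matrix for a 3-bp site (assumed values).
# Each row is one position; columns follow BASES. The optimal base has
# relative affinity 1.0, so the optimal site here is "TGA".
TOY_PSAM = [
    [0.25, 0.25, 0.25, 1.0],
    [0.25, 0.25, 1.0, 0.25],
    [1.0, 0.25, 0.25, 0.25],
]


def site_affinity(site, psam):
    """Relative affinity of one site: product of per-position relative affinities."""
    affinity = 1.0
    for position, base in enumerate(site):
        affinity *= psam[position][BASES.index(base)]
    return affinity


def predicted_occupancy(sequence, psam):
    """Sum of relative affinities over all forward-strand windows of the sequence."""
    width = len(psam)
    return sum(
        site_affinity(sequence[i:i + width], psam)
        for i in range(len(sequence) - width + 1)
    )


def fit_scale(occupancies, signals):
    """Ordinary least-squares fit of signal = intercept + slope * occupancy."""
    n = len(occupancies)
    mean_x = sum(occupancies) / n
    mean_y = sum(signals) / n
    sxx = sum((x - mean_x) ** 2 for x in occupancies)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(occupancies, signals))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x
```

In a full fit, the matrix entries themselves would be the free parameters, adjusted to minimize the squared difference between predicted and measured occupancy across all probes; the sketch above holds the matrix fixed and fits only the linear scale.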
