Predicting gene expression from sequence
- PMID: 15084257
- DOI: 10.1016/s0092-8674(04)00304-6
Abstract
We describe a systematic genome-wide approach for learning the complex combinatorial code underlying gene expression. Our probabilistic approach identifies local DNA-sequence elements and the positional and combinatorial constraints that determine their context-dependent role in transcriptional regulation. The inferred regulatory rules correctly predict expression patterns for 73% of genes in Saccharomyces cerevisiae, utilizing microarray expression data and sequences in the 800 bp upstream of genes. Application to Caenorhabditis elegans identifies predictive regulatory elements and combinatorial rules that control the phased temporal expression of transcription factors, histones, and germline specific genes. Successful prediction requires diverse and complex rules utilizing AND, OR, and NOT logic, with significant constraints on motif strength, orientation, and relative position. This system generates a large number of mechanistic hypotheses for focused experimental validation, and establishes a predictive dynamical framework for understanding cellular behavior from genomic sequence.
