Proc Natl Acad Sci U S A. 2005 Nov 1;102(44):15936-41.
doi: 10.1073/pnas.0505537102. Epub 2005 Oct 19.

Evolutionary population genetics of promoters: predicting binding sites and functional phylogenies

Ville Mustonen et al. Proc Natl Acad Sci U S A. 2005.

Abstract

We study the evolution of transcription factor-binding sites in prokaryotes, using an empirically grounded model with point mutations and genetic drift. Selection acts on the site sequence via its binding affinity to the corresponding transcription factor. Calibrating the model with populations of functional binding sites, we verify this form of selection and show that typical sites are under substantial selection pressure for functionality: for cAMP response protein sites in Escherichia coli, the product of fitness difference and effective population size, 2NΔF, takes values of order 10. We apply this model to cross-species comparisons of binding sites in bacteria and obtain a prediction method for binding sites that uses evolutionary information in a quantitative way. At the same time, this method predicts the functional histories of orthologous sites in a phylogeny, evaluating the likelihood for conservation or loss or gain of function during evolution. We have performed, as an example, a cross-species analysis of E. coli, Salmonella typhimurium, and Yersinia pseudotuberculosis. Detailed lists of predicted sites and their functional phylogenies are available.
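To give a feel for what a scaled selection strength 2NΔF of order 10 means, the sketch below evaluates a standard weak-mutation (Kimura-type) substitution rate relative to the neutral rate. This formula is an assumption brought in for illustration; the abstract does not state which rate expression the model uses, and the function name `relative_substitution_rate` is hypothetical.

```python
import math


def relative_substitution_rate(scaled_selection: float) -> float:
    """Substitution rate relative to the neutral rate.

    Assumes the textbook weak-mutation form s / (1 - exp(-s)),
    where s = 2N * DeltaF is the scaled fitness difference of the
    mutant relative to the resident sequence. This is an
    illustrative assumption, not a formula quoted from the paper.
    """
    s = scaled_selection
    if abs(s) < 1e-12:
        return 1.0  # neutral limit of s / (1 - exp(-s))
    if s < -700.0:
        return 0.0  # avoids floating-point overflow in expm1
    # -expm1(-s) equals 1 - exp(-s), computed stably for small |s|
    return s / -math.expm1(-s)


if __name__ == "__main__":
    for s in (-10.0, -1.0, 0.0, 1.0, 10.0):
        print(f"2N*DeltaF = {s:+5.1f}  relative rate = "
              f"{relative_substitution_rate(s):.6f}")
```

Under this assumed form, a deleterious mutation with 2NΔF = -10 fixes several thousand times more slowly than a neutral one, which is consistent with the abstract's statement that typical sites are under substantial selection for functionality.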

Figures

Fig. 1.
Cross-species energy transition probabilities [formula image] for neutral evolution (blue) and [formula image] for evolution under time-independent selection in the fitness landscape of Fig. 3a (red). Curves are shown for fixed initial energy E1 = 8 and various evolutionary distances t. The third curve in each family belongs to the distance between aligned loci of E. coli and S. typhimurium. Typical loci evolve toward weaker binding under neutrality but maintain their binding energy under selection.
Fig. 2.
Functional phylogenies for two species at evolutionary distances t1 and t2, counted from their last common ancestor at time ta = 0. Branch segments with neutral evolution are shown in blue, those with evolution under selection in red. (a) Neutral evolution of nonfunctional loci, described by the energy pair distribution [formula image]. (b) Evolution of functional loci under time-independent selection, described by the distribution Qt. (c) Evolution under time-dependent selection generating a functional locus in species 1 and a nonfunctional locus in species 2, described by the distribution [formula image]. This mode involves either a gain of function between the ancestor and species 1 or a loss of function between the ancestor and species 2. The switching event at time t′ is denoted by a green arrow. The corresponding mode where the roles of the two species are interchanged is described by the distribution [formula image].
Fig. 3.
Energy statistics and fitness landscape for CRP-binding loci in E. coli. (a) Count histogram with energy bins of width 0.1 (black), expected background counts (blue), and excess counts above background (red), with a 30-fold zoom into the region E < 14. The color bar indicates the probability of functionality ρQ(E), ranging from 1 (red) to 0 (blue). (b) Decomposition of the counts (log-scale, left y axis) according to the single-species hidden Markov model: background distribution (1 - λ)P0(E) (blue), distribution λQ(E) of functional loci (red), and total distribution W(E) (orange). The resulting fitness landscape ΔF(E) according to Eq. 6 is also shown in orange (thick curve, right y axis).
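The decomposition in panel (b) can be sketched numerically. The example below assumes that the total distribution is the two-component mixture W(E) = (1 - λ)P0(E) + λQ(E) and that the probability of functionality ρQ(E) is the Bayes posterior λQ(E)/W(E). Both readings are inferred from the caption rather than quoted from the paper, and the function name and toy numbers are hypothetical. Eq. 6 for ΔF(E) is not reproduced on this page, so it is not implemented here.

```python
from typing import List, Sequence


def probability_of_functionality(
    lam: float,
    q: Sequence[float],
    p0: Sequence[float],
) -> List[float]:
    """Per-bin posterior probability that a locus is functional.

    lam : assumed prior fraction of functional loci (lambda)
    q   : distribution Q(E) of functional loci, one value per energy bin
    p0  : background distribution P0(E), one value per energy bin

    Assumes W(E) = (1 - lam) * P0(E) + lam * Q(E) and returns
    lam * Q(E) / W(E) for each bin (0.0 where W(E) is zero).
    """
    if len(q) != len(p0):
        raise ValueError("q and p0 must have the same number of bins")
    posterior = []
    for q_e, p0_e in zip(q, p0):
        w_e = (1.0 - lam) * p0_e + lam * q_e  # total distribution W(E)
        posterior.append(lam * q_e / w_e if w_e > 0.0 else 0.0)
    return posterior


if __name__ == "__main__":
    # Toy three-bin example (hypothetical numbers, low energy first):
    # functional loci concentrate at low energy, background at high energy.
    toy_q = [0.5, 0.5, 0.0]
    toy_p0 = [0.0, 0.01, 0.99]
    print(probability_of_functionality(0.01, toy_q, toy_p0))
```

In this toy setting the posterior falls from 1 in the lowest-energy bin to 0 in the highest, mirroring the red-to-blue color bar described for panel (a).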
Fig. 4.
Binding energy pairs and functional histories for aligned loci in E. coli and S. typhimurium. (a) Dot plot of counts (E1, E2), including verified binding sites (light blue). The background color shading indicates the likelihood of functional histories, varying between blue (conserved neutrality), red (conserved function), and green (functional switching). Isoprobability lines [formula image] (α = 0, Q, 0s, s0) are dotted. (b) Energy pair density obtained from the counts (filled contours), compared with the distribution Wt(E1, E2) (contour lines, Wt = 10⁻⁷, 10⁻⁶, 10⁻⁵, 10⁻⁴).
