PLoS Comput Biol. 2013;9(7):e1003135.
doi: 10.1371/journal.pcbi.1003135. Epub 2013 Jul 11.

Genetic Selection for Context-Dependent Stochastic Phenotypes: Sp1 and TATA Mutations Increase Phenotypic Noise in HIV-1 Gene Expression

Free PMC article

Kathryn Miller-Jensen et al. PLoS Comput Biol. 2013.


The sequence of a promoter within a genome does not uniquely determine gene expression levels and their variability; rather, promoter sequence can additionally interact with its location in the genome, or genomic context, to shape eukaryotic gene expression. Retroviruses, such as human immunodeficiency virus-1 (HIV), integrate their genomes into those of their host and thereby provide a biomedically relevant model system to quantitatively explore the relationship between promoter sequence, genomic context, and noise-driven variability in viral gene expression. Using an in vitro model of the HIV Tat-mediated positive-feedback loop, we previously demonstrated that fluctuations in viral Tat-transactivating protein levels generate integration-site-dependent, stochastically driven phenotypes, in which infected cells randomly 'switch' between high and low expressing states in a manner that may be related to viral latency. Here we extended this model and designed a forward genetic screen to systematically identify genetic elements in the HIV LTR promoter that modulate the fraction of genomic integrations that specify 'Switching' phenotypes. Our screen identified mutations in core promoter regions, including Sp1 and TATA transcription factor binding sites, which increased the Switching fraction several-fold. By integrating single-cell experiments with computational modeling, we further investigated the mechanism of Switching-fraction enhancement for a selected Sp1 mutation. Our experimental observations demonstrated that the Sp1 mutation both impaired Tat-transactivated expression and altered basal expression in the absence of Tat. Computational analysis demonstrated that the observed change in basal expression could contribute significantly to the observed increase in viral integrations that specify a Switching phenotype, provided that the selected mutation affected Tat-mediated noise amplification differentially across genomic contexts. Our study thus demonstrates a methodology to identify and characterize promoter elements that affect the distribution of stochastic phenotypes over genomic contexts, and advances our understanding of how promoter mutations may control the frequency of latent HIV infection.

Conflict of interest statement

The authors have declared that no competing interests exist.


Figure 1. An in vitro model of HIV gene expression exhibits a distribution of integration-site-dependent phenotypes, including noise-driven Switching phenotypes.
(A) Schematic of the full-length HIV lentiviral model of the Tat-mediated positive feedback loop (sLTR-Tat-GFP). Viral proteins other than Tat were inactivated and Nef was replaced with GFP. (B–C) Flow cytometry histogram of Jurkat cells infected with a single HIV WT virus for (B) a bulk population with mixed integration positions and (C) sample Jurkat clonal populations, each containing a single (different) genomic integration of the WT HIV provirus. Representative Dim and Bright clonal histograms were chosen to span the range of fluorescence means. For Switching phenotypes, representative clonal histograms were chosen from the distribution clusters that were used to define a quantitative Switching criterion. GFP axis range is the same for all histograms. (D) Quantification of the WT Switching fraction based on a stratified sample of clones from the full range of GFP expression (“Full”), and based on a sub-sample of clones sorted from only the Mid region of the bulk fluorescence range (“Mid”). Error bars mark 95% confidence intervals, estimated by a bootstrap method.
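The bootstrap confidence intervals in panel (D) can be illustrated with a minimal sketch. It assumes a plain percentile bootstrap over clones coded as 1 (Switching) or 0 (not Switching); the stratified sampling and the exact bootstrap variant used in the study are not described in this caption, and the function name `bootstrap_fraction_ci` and the clone counts below are hypothetical.

```python
import random

def bootstrap_fraction_ci(labels, n_boot=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap CI for the fraction of 1s in `labels` (assumed method)."""
    rng = random.Random(seed)
    n = len(labels)
    fractions = []
    for _ in range(n_boot):
        # Resample clones with replacement and recompute the fraction.
        resample = [labels[rng.randrange(n)] for _ in range(n)]
        fractions.append(sum(resample) / n)
    fractions.sort()
    lower = fractions[int((alpha / 2) * n_boot)]
    upper = fractions[int((1 - alpha / 2) * n_boot) - 1]
    return sum(labels) / n, lower, upper

# Hypothetical data for illustration only: 10 Switching clones out of 80.
clones = [1] * 10 + [0] * 70
estimate, lower, upper = bootstrap_fraction_ci(clones)
print(f"fraction={estimate:.3f}, 95% CI=({lower:.3f}, {upper:.3f})")
```

A percentile bootstrap is used here because it needs no distributional assumption about the fraction; other bootstrap variants would change only the last two index lookups.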
Figure 2. A computational model of LTR transcription with Tat feedback demonstrates noise-driven Switching phenotypes with delayed activation/deactivation.
(A) Model schematic: The viral LTR promoter probabilistically switches between a transcriptionally inactive state and a transcriptionally active state, each transition occurring at its own rate. In the active state, transcripts are produced at a transcription rate and degraded at a transcript degradation rate. Protein translation occurs from each transcript independently at rate formula image, and each protein is degraded with rate formula image. As a model of basal transcription, all rates are assumed constant, and transcript is produced in bursts when formula image and formula image is of order 1 or greater. For the transactivation circuit, the translated protein is Tat (plus GFP), and we include a Michaelis-Menten-like dependence on Tat for the promoter activation and the transcription rates (highlighted in red in the model schematic): formula image, formula image, formula image. The parameters formula image and formula image specify fold-amplification at saturated Tat binding, and formula image specifies the saturation concentration. The model output is the predicted steady-state distribution of protein (GFP and Tat) count across a clonal population of cells, which is then converted to cytometer RFU based on previous calibration. (B) Simulated protein distributions were evolved over time from a Dim initialization (left) for representative parameter values that lead to Dim, Switching, and Bright steady-state phenotypes (right, blue curves). Simulated steady-state basal expression distributions for the same parameter values without Tat feedback are given for comparison (i.e., formula image; green curves). Simulated histograms are normalized and plotted on the same fluorescence axis as the cytometer data in Figure 1. (C) A phase diagram summarizes the expression phenotypes predicted by the Tat feedback model as basal transcription parameters (formula image and formula image) are varied over the observed experimental range of values while remaining model parameters are fixed. Drawn boundaries separate parameter combinations leading to distinct expression phenotypes. Model-predicted equilibration times (i.e., the time after which half of a Dim-initialized population crosses an intermediate expression threshold between Dim and Bright) are represented on a color scale, with longer times predicted for parameter combinations that specify Switching phenotypes. Parameter combinations used in (B) are marked with an asterisk.
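The model structure in panel (A) can be sketched as a stochastic simulation. This is a minimal illustration, not the authors' implementation: it uses the Gillespie algorithm as a stand-in for however the steady-state distributions were actually computed, the feedback function `mm_amplify` is one assumed Michaelis-Menten-like form (basal rate without Tat, `fold` times basal at saturation), and every parameter name and value in `PARAMS` is hypothetical rather than a fitted value from the study.

```python
import random

def mm_amplify(basal, fold, tat, k_sat):
    """Assumed feedback form: `basal` at tat = 0, approaching fold * basal for tat >> k_sat.
    Requires k_sat > 0."""
    return basal * (1.0 + (fold - 1.0) * tat / (tat + k_sat))

def simulate_cell(p, t_end, seed):
    """Gillespie simulation of one cell; returns (promoter_on, mrna, protein) at t_end."""
    rng = random.Random(seed)
    t, on, mrna, prot = 0.0, False, 0, 0  # Dim initialization: promoter off, no product
    while True:
        k_on = mm_amplify(p["k_on"], p["fold_on"], prot, p["k_sat"])
        k_tx = mm_amplify(p["k_tx"], p["fold_tx"], prot, p["k_sat"])
        rates = [
            0.0 if on else k_on,        # 0: promoter activation
            p["k_off"] if on else 0.0,  # 1: promoter inactivation
            k_tx if on else 0.0,        # 2: transcription (active state only)
            p["d_m"] * mrna,            # 3: transcript degradation
            p["k_tl"] * mrna,           # 4: translation, independent per transcript
            p["d_p"] * prot,            # 5: protein degradation
        ]
        total = sum(rates)              # always > 0 because k_on and k_off are > 0
        t += rng.expovariate(total)     # waiting time to the next reaction
        if t > t_end:
            return on, mrna, prot
        # Pick a reaction with probability proportional to its rate.
        threshold = rng.random() * total
        cumulative, chosen = 0.0, None
        for idx, rate in enumerate(rates):
            if rate <= 0.0:
                continue
            cumulative += rate
            chosen = idx
            if threshold < cumulative:
                break
        if chosen == 0:
            on = True
        elif chosen == 1:
            on = False
        elif chosen == 2:
            mrna += 1
        elif chosen == 3:
            mrna -= 1
        elif chosen == 4:
            prot += 1
        else:
            prot -= 1

# Hypothetical parameters in arbitrary time units (illustrative only).
PARAMS = {"k_on": 0.05, "k_off": 1.0, "k_tx": 5.0, "d_m": 1.0,
          "k_tl": 2.0, "d_p": 0.1, "fold_on": 5.0, "fold_tx": 5.0, "k_sat": 50.0}

# A small simulated "clonal population": same parameters, different random seeds.
proteins = [simulate_cell(PARAMS, t_end=50.0, seed=s)[2] for s in range(20)]
print(sorted(proteins))
```

Setting `fold_on` and `fold_tx` to 1 removes the Tat dependence, which corresponds to the basal (no-feedback) comparison described in panel (B).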
Figure 3. A dynamic forward genetic screen selects for LTR promoter sites that increase the frequency of delayed gene expression activation and deactivation.
(A) Schematic of the genetic screen. (B–G) Jurkat cells were infected with the HIV lentiviral vector containing the WT promoter, the unselected library of promoters, or promoter libraries from each round of selection for delayed activation or deactivation. (B) Fraction of cells that showed delayed activation 5 days after sorting from the Dim gate. (C) Fraction of cells that showed delayed deactivation 5 days after sorting from the Bright gate. (D,E) Median GFP expression of the bright peak for promoter libraries selected from the (D) activation screen or (E) deactivation screen. All bar graphs are presented as the mean ± standard deviation of 3 replicates, and are representative of duplicate experiments. (F,G) Flow cytometry histograms comparing the WT initial bulk, multi-integration expression profile to the profile following four rounds of selection for (F) delayed activation or (G) delayed deactivation.
Figure 4. Genetic screen selects for mutations in the core LTR promoter.
(A) Approximately 90 clones were sequenced per library of promoters. (Top) Sequenced clones from the activation and deactivation screens were combined and the distribution of mutations in functional regions of the LTR was compared to the distribution of mutations throughout the entire LTR. (Bottom) The frequency of mutations was plotted for each position of the LTR for the delayed activation screen (red), the delayed deactivation screen (blue), and the unselected library (black). (B) Frequency of mutations within the core promoter region for the delayed activation screen (red) and the delayed deactivation screen (blue). Arrows indicate the top two mutations that were selected in both screens. (C) Bar graph displaying the fraction of selected LTR sequences that have mutations in Sp1 site III or the TATA box for the activation screen (red) and the deactivation screen (blue).
Figure 5. Selected mutations in Sp1 site III and the TATA box increase the Switching fraction.
Jurkat cells were infected with the HIV lentiviral vector containing the WT promoter, with a single point mutation in Sp1 site III (position 4), or with a single point mutation in the TATA box (position 2). (A) Relative fraction of cells that activated 5 days after sorting from the Off gate. (B) Relative fraction of cells that deactivated 5 days after sorting from the Bright gate. (C) Flow cytometry histograms comparing the WT bulk-infection profile (gray) to the profile for TATAmutP2 (left) and Sp1mutIII (right). Note the reduced weight and position of the Bright (Tat-transactivated) peak and the increased weight of the Mid region. (D) Switching fractions for WT and selected mutants. Approximately 80 clones were sorted from the Mid region for each infected population, and the Switching fraction was estimated as described in the main text. Error bars indicate 95% CIs, estimated by a bootstrap method. Significant differences from WT (p<0.01) are indicated by (*).
Figure 6. Selected mutations result in small but significant differences in basal gene expression.
(A) Flow cytometry bulk-infection histograms for Jurkat cell populations. Each cell contains a single (different) integration of the Tat-null vector (sLTR-GFP-TatKO) with a WT LTR promoter (black), or an LTR with an Sp1 site III mutation (red). Uninfected Jurkat histogram is displayed for reference (gray). (B–D) Distribution noise (defined as CV²) versus mean GFP for Sp1 mutant clones sorted and expanded from the bulk populations in (A). (C–D) Clonal histograms were fit with the stochastic gene-expression model in the absence of feedback (Figure 2A), and best-fit parameters were calculated for (C) transcriptional burst size and (D) transcriptional burst frequency. Each point in B–D represents a single-integration clone from a WT (gray) or Sp1 mutant (red) infection.
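The noise measure used in this figure, the squared coefficient of variation (variance divided by squared mean), can be computed per clone as in the sketch below. The helper name `noise_cv2`, the use of the population variance, and the per-cell GFP values are assumptions for illustration; they are not data or code from the study, and the model fitting for burst size and frequency is not reproduced here.

```python
from statistics import fmean, pvariance

def noise_cv2(values):
    """Squared coefficient of variation: population variance / mean**2."""
    mean = fmean(values)
    if mean == 0:
        raise ValueError("CV^2 is undefined for a zero mean")
    return pvariance(values, mu=mean) / mean ** 2

# Hypothetical per-cell GFP readings for two clones (illustrative only).
clone_gfp = {
    "clone_a": [120.0, 95.0, 210.0, 150.0, 80.0],
    "clone_b": [900.0, 1100.0, 1000.0, 950.0, 1050.0],
}
for name, gfp in clone_gfp.items():
    print(name, f"mean={fmean(gfp):.1f}", f"CV2={noise_cv2(gfp):.4f}")
```

Plotting each clone's CV² against its mean gives one point per clone, which is the layout the caption describes.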
Figure 7. Computational models exploring Switching fraction modulation by the Sp1 mutation.
(A) Model phase diagrams varying basal transcriptional parameters at fixed values of Tat feedback parameters. Drawn boundaries separate parameter combinations leading to distinct phenotypes (as in Figure 2C). Superimposed color map estimates the probability density with which the virus samples basal transcription parameters over genomic integrations for the WT promoter (left) and Sp1 mutant promoter (right). Tat feedback parameters that result in a WT Switching-fraction estimate of 12% specify the solid phenotypic boundaries (base). Decreasing the fold-amplification of Tat feedback (reduced feedback, short dashed lines) shifts phenotypic boundaries to the right, while impaired reinitiation (long dashed lines) has little effect on phenotypic boundaries. (B) Estimated Switching fractions for the sets of Tat feedback parameters used in (A), normalized by the predicted WT Switching fraction for the base set of parameters (solid line). (C) Sample Switching (grey) and Bright (black) distributions for the base set of Tat feedback parameters (solid) and for impaired reinitiation parameters (dashed). The degree of transcriptional reinitiation impairment was chosen to produce a shift in the Bright phenotype comparable to that produced by the reduced-feedback parameters (A–B). The model extension to include transcriptional reinitiation was implemented by a simple rescaling of model parameters according to: formula image (rescaled basal transcription rate); formula image (rescaled amplification factor for transactivated transcription rate); formula image (rescaled feedback saturation parameter). Details may be found in Text S1.


