Nat Genet. 2020 Jul;52(7):709-718.
doi: 10.1038/s41588-020-0645-y. Epub 2020 Jun 29.

Single-cell analysis of clonal maintenance of transcriptional and epigenetic states in cancer cells

Zohar Meir et al. Nat Genet. 2020 Jul.

Abstract

Propagation of clonal regulatory programs contributes to cancer development. It is poorly understood how epigenetic mechanisms interact with genetic drivers to shape this process. Here, we combine single-cell analysis of transcription and DNA methylation with a Luria-Delbrück experimental design to demonstrate the existence of clonally stable epigenetic memory in multiple types of cancer cells. Longitudinal transcriptional and genetic analysis of clonal colon cancer cell populations reveals a slowly drifting spectrum of epithelial-to-mesenchymal transcriptional identities that is seemingly independent of genetic variation. DNA methylation landscapes correlate with these identities but also reflect an independent clock-like methylation loss process. Methylation variation can be explained as an effect of global trans-acting factors in most cases. However, for a specific class of promoters (in particular, cancer-testis antigens), de-repression is correlated with and probably driven by loss of methylation in cis. This study indicates how genetic sub-clonal structure in cancer cells can be diversified by epigenetic memory.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Extended Data Fig. 1
Extended Data Fig. 1. MARS-seq in short-term clonal populations.
a, Schematic readout of transcriptional memory test using a Luria-Delbrück design. b, Distributions of the total number of UMIs obtained per clone in different cell-lines. n = number of clones profiled. c, Distributions of total UMIs obtained per cell in different cell-lines. n = number of cells profiled. d, Normalized expression variability (log2(variance/mean)) per gene in single cells obtained by 10x (x-axis) and MARS-seq (y-axis). Genes with high normalized variance are annotated. Blue - cell-cycle markers. e, Normalized gene expression variance in HCT116 cells. Selected variable genes (black) and cell-cycle markers (blue) are annotated. The purple line shows a rolling-median trend. For both plots, cells are down-sampled to 6K UMIs. f, Normalized pooled expression of 17,949 common genes in single cells obtained by 10x (x-axis) and MARS-seq (y-axis). Expression values were computed as log2 of UMIs / 10K UMIs. The five top differential genes are annotated in red. g, Total log2 UMI counts in two MARS-seq technical replicates of 260 well-covered HCT116 clonal populations (>10K UMIs in both replicates). Color of dots indicates first (black) or second (blue) culturing batches. h, Normalized expression of selected variable genes and gene-modules in technical replicates of HCT116 clones. i-j, As in g-h, for 199 H1299 clonal populations. All replicates were covered by at least 5K UMIs (random pairs of quadruple experiments are shown). k-l, As in g-h, for 157 A549 clonal populations where all replicates are covered by at least 5K UMIs. m-n, As in g-h, for 57 WI38 clonal populations where all replicates are covered by at least 5K UMIs. Three randomly selected replicates were summed to represent a single technical replicate. ρ values represent Spearman’s correlations and r values represent Pearson’s.
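The two per-gene quantities named in panels d-f (expression as log2 of UMIs per 10K UMIs, and variability as log2(variance/mean)) can be illustrated with a minimal Python sketch. The function names, the pseudocount and the toy counts are assumptions made for illustration and are not taken from the authors' pipeline; down-sampling cells to a fixed UMI depth is omitted.

```python
import math
from statistics import mean, pvariance

def log2_per_10k(gene_umis, total_umis, pseudocount=1e-5):
    """log2 of a gene's UMIs per 10K total UMIs (the pseudocount is an assumption)."""
    return math.log2(gene_umis / total_umis * 1e4 + pseudocount)

def log2_variance_to_mean(counts):
    """Normalized variability of one gene across cells: log2(variance / mean)."""
    m = mean(counts)
    v = pvariance(counts)
    if m == 0 or v == 0:
        return float("-inf")  # ratio is undefined for silent or constant genes
    return math.log2(v / m)

# Toy UMI counts for two hypothetical genes across six cells.
steady = [4, 5, 4, 5, 4, 5]    # similar expression in every cell
bursty = [0, 0, 0, 0, 0, 27]   # same total, concentrated in one cell

steady_score = log2_variance_to_mean(steady)
bursty_score = log2_variance_to_mean(bursty)
```

In this toy case both genes have the same mean, so the score separates them purely by how unevenly their UMIs are spread across cells.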
Extended Data Fig. 2
Extended Data Fig. 2. Identification of cell-cycle independent transcription variation of HCT116, H1299 and WI38 single cells.
a-c, Normalized pooled expression in clonal populations (x-axis) and single-cells (y-axis) in WI38 (left), H1299 (center) and HCT116 (right) cells. Expression values computed as log2 of UMIs / 10K UMIs. Genes with high differential expression in each system are displayed by red dots and annotated. n (WI38, HCT116) = 27,052, n (H1299) = 17,698. d, Distributions of total number of molecules per cell in inferred cell-cycle metacells of HCT116. Colors are as in Fig. 1b. e, log2 total expression signatures (left) and ratio of cell-cycle phases (right) in WI38 cells. Right panel shows only cells that were annotated as replicating in left panel. f, as in e for H1299 cells. g, as in e for HCT116 cells. A full list of genes used in this assay for all cell lines is in Supplementary Table 1. h, Illustration of expression randomization in each cell according to cell-cycle based cell-cell similarity graph. i, Showing scRNA gene profiles correlation with EpCAM expression, controlled by each gene total expression (left), with a running median shown in red. Following subtraction of the trendline, correlations are generally independent of gene sampling depth (right). j, Matrix of gene-gene correlations in HCT116 cells before (upper triangle) and after (lower triangle) cell-cycle based randomization. Showing selected cell-cycle related (Supplementary Table 1) and unrelated (Supplementary Table 3) gene modules. Number of analyzed cells defined in Extended Data Fig. 1c. k, Maximal correlation value of each gene with another gene before (x-axis) and after (y-axis) cell-cycle based randomization. Loss of correlation (blue dots) indicates that the co-expression patterns of this gene were independent of the cell-cycle, thus eliminated by the randomization. l-m, as in j-k, for NCI-H1299 cells. n-o, as in j-k, for WI38 cells.
Extended Data Fig. 3
Extended Data Fig. 3. Cell-cycle independent gene modules in HCT116 cells.
a-c, Spearman’s correlations (depth-adjusted) of HCT116 scRNA-seq gene profiles without (blue, red) and following (light blue, tomato) permutation. Bar graphs show top positively (left) and negatively (right) correlated genes with EpCAM, VIM and IDI1, with the respective distributions of original and permuted correlations shown at the bottom. d, Normalized expression of epithelial module genes in HCT116 clones. For each gene (row), expression is divided by maximal value observed in clones. Displaying 233 clones (columns) covered by at least 50K UMIs. e, Matrix showing clustered gene-gene correlations of all genes defined to maintain strong cell-cycle independent co-variation in Extended Data Fig. 2k (and summarized in Supplementary Table 2). Labels of genes related to the epithelial module shown in d are colored in green, and its anti-correlated gene Vimentin (VIM) is colored in magenta. f, As in Fig. 1j for clones, we grouped cells obtained by MARS-seq (top) and 10x (bottom) into five bins based on expression of the EpCAM gene module (Ep5 consisting of cells with highest module expression). Bars show mean expression of each bin for the EpCAM gene (blue) and for genes negatively enriched in EpCAM high cells (red). Error bars represent standard error of binomial distribution. g, Distribution of normalized expression of Cholesterol (purple), Epithelial (antique-white) and EMT genes (red), binned and ordered according to the cell-cycle associated HCT116 metacells shown in Fig. 1b.
Extended Data Fig. 4
Extended Data Fig. 4. Cell-cycle independent gene modules in H1299, WI38 and A549 cells.
a, As shown in Extended Data Fig. 3e for HCT116, clustering gene-gene correlations of all H1299 cell-cycle independent genes labeled in Supplementary Fig. 3d (and summarized in Supplementary Table 2). Number of cells and clones analyzed from each cell-line are defined in Extended Data Fig. 1b,c. b, As in Extended Data Fig. 3d, showing normalized expression of ID and SERPINE1 gene modules in all H1299 single-cell derived clones. c, Comparison of NCI-H1299 gene-gene correlation over single cells (upper triangle) and clones (lower triangle). d, As in Supplementary Fig. 1d, showing for each gene the log2-ratio of relative expression in high SERPINE1 (top 25% H1299 cells and 20% H1299 clones) and low SERPINE1 (lower 30% cells and 40% clones) cells (x-axis) and clones (y-axis). Labeling genes of the ID module (red dots) and of the SERPINE1 gene module (blue dots). e, Total output (log-normalized expression) of SERPINE1 gene module (x-axis) and ID gene module (y-axis) in cells and clones shows a bi-modal, clonally conserved population structure in the NCI-H1299 system. f, As in c, Comparison of WI38 gene-gene correlation over single cells (upper triangle) and clones (lower triangle). g, As in a, clustering gene-gene correlations of all WI38 cell-cycle independent genes labeled in Extended Data Fig. 2o. Showing black labels for collagen module genes. h, As in b, showing normalized expression of Collagen module genes in WI38 clones. i, Gene-gene correlation of most variable genes in A549 clones. Labels of selected gene are shown on right. j, As in Extended Data Fig. 3d, showing normalized expression of variable genes in A549 clones.
Extended Data Fig. 5
Extended Data Fig. 5. Longitudinal whole exome sequencing (WES) analysis of selected clonal populations.
a, Coverage Summary of 27 Whole Exome Sequencing (WES, see Methods) experiments. Total reads obtained per sample (orange) and median on-target coverage per base (blue) are shown. Other stats and WES quality control are available in Supplementary Tables 12,13. b, Fraction of SNPs detected per coverage bin in different cell lines (mutational burden). Calls from all clones were aggregated per cell line. Coverage per base was obtained by DepthOfCoverage module of GATK v3.5. c, Allele frequency (AF) distributions of detected variants in HCT116 clones sampled after 78 days (top) and 168 days (bottom). d, Spatial distribution of SNPs detected in HCT116 clones. e, Comparison of allele frequencies in five HCT116 single-cell derived clones after 78 days (x axis) and 168 days (y axis). If selection was greatly affecting the process, allele frequencies were not expected to remain stable as observed in practice. f, Expression of marker genes in six A549 clones that were selected for exome analysis (colored in brown). g, Similar to Fig. 2c, kinship analysis of A549 clones. Rows above column show normalized expression of KRT18 (red) and SERPINE1 (blue) genes in each clone. h, Selection of seven NCI-H1299 clones (colored in magenta). i, Kinship analysis as in Fig. 2c for NCI-H1299 clones. j, Normalized expression of the SERPINE1 and ID modules in H1299. Single cells represented by small grey dots. Clones profiled by WES are labeled in black and red (as in panel i). Note the concordance between the genetic and transcriptional sub-clonal structure for these cells.
Extended Data Fig. 6
Extended Data Fig. 6. scPBAT and PBAT-capture of HCT116 clones using a colon cancer oriented probe-set.
a, Distribution of methylation calls in low-depth HCT116 single-cell PBAT analysis. b, Whole genome coverage and on-target coverage for HCT116 clonal populations. Coverage = total number of methylation calls. c, Density plot of pooled average methylation of on-target regions in single cells (black line) and clonal populations (antique-white line). d, Distributions of pooled averaged methylation of on-target regions in clones, grouped by their respective pooled average methylation in single cells (regions with > 50 calls in single cells and > 500 calls in clones, n = 341, 69, 49, 49, 39, 32, 67, 77, 186, 595). e, Pooled average methylation of individual cells in very low (0-1%) and low (1-2%) CG-content regions. f, Trends showing the correspondence between DNA methylation and CpG content for 1,022 single cells. g, Similar to f, showing correlation with genomic replication time. h, Single cell methylation average in regions with low CG-content (0-3%), defining classification into low, mid and high-bg cells. i, Distribution of the log2 ratio of coverage of genomic sequences in early- and late-replicating regions. The vertical dashed grey line defines the threshold for classifying single cells into G1 and S phase. j, Distribution of average methylation per cell in genome-wide low CpG regions (0-3%) for cells inferred to be in G1 and S phases in panel i. n(G1) = 254, n(S) = 767. k, Average promoter and enhancer (Methods) methylation in groups of single cells. For all groups, n > 6000, chi-squared test, in all cases P < 2 × 10^-16. l, Genomic spatial distribution of colon PBAT-capture probe set. m, Number of regions covered by the probe set, stratified by genomic context. n, Distribution of methylation of covered regions in TCGA colon cancer (COAD). Shown is average methylation of CpGs that reside within (blue) and outside (orange) the PBAT-capture probe set, grouped by genomic context (for all comparisons n > 6000, two-tailed KS test: D > 0.11, P < 2 × 10^-16).
o-q, Average methylation of 293 TCGA colon cancer tumors (COAD), in different ranges of CpG content.
Extended Data Fig. 7
Extended Data Fig. 7. Clonal methylation at functional regions is associated with epithelial transcriptional output in HCT116.
a, Example for selection of clones for KNN-based normalization of DNA methylation over the clonal HCG (y axis) and LCG (x axis) space. Red dots indicate the K = 25 nearest neighboring clones used to normalize methylation of the selected clone (shown as blue dot). b, Distribution of correlations between average methylation of capture regions in clones to average methylation of clones in Low-CG (LCG) loci before (grey) and after (black) KNN normalization. c, same as b for High-CG (HCG) loci. d, Clustering of Spearman’s cross-correlation between gene expression and normalized average methylation of capture regions over 251 HCT116 clones covered simultaneously by RNA-seq and PBAT-capture. Green annotation of genes indicates epithelial genes. e, Epithelial transcriptional output per clone (x-axis) and clonal average methylation (y-axis) in 73 capture regions defined in d as Epithelial regions (Ep. regions, top) and 62 capture regions defined in d as EMT related regions (bottom) in 155 HCT116 clones, covered by at least 50K UMIs and 50K on-target methylation calls. f, As in e, showing expression of EMT related gene Zinc finger E-box-binding homeobox 1 (ZEB1) and methylation in Epithelial (top) and EMT (bottom) associated capture regions defined in d. g, Pooled average methylation of enhancer CpGs in EpCAM-high and -low clones, highlighting enhancers of epithelial up- (blue) or down-regulated (red) genes.
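Panel a describes normalizing each clone against its K = 25 nearest neighbours in the (LCG, HCG) methylation space. A minimal Python sketch of that idea follows. It assumes that normalization means subtracting the neighbours' mean methylation of the same capture region and that neighbours are ranked by Euclidean distance; the exact procedure is given in the paper's Methods, which are not reproduced here, so the function name, data layout and toy values are all hypothetical.

```python
import math

def knn_normalize(clones, k=25):
    """clones: list of dicts with 'lcg' and 'hcg' (global averages) and
    'region' (methylation of one capture region).
    Returns, per clone, region methylation minus the mean over its k nearest
    clones in (LCG, HCG) space. Subtraction and Euclidean distance are
    assumptions made for this sketch."""
    normalized = []
    for i, c in enumerate(clones):
        others = [o for j, o in enumerate(clones) if j != i]
        others.sort(key=lambda o: math.hypot(o["lcg"] - c["lcg"],
                                             o["hcg"] - c["hcg"]))
        neighbours = others[:k]
        background = sum(o["region"] for o in neighbours) / len(neighbours)
        normalized.append(c["region"] - background)
    return normalized

# Toy data: region methylation tracks the global LCG level, except for
# clone 3, which carries an extra region-specific loss of 0.2.
toy = [{"lcg": 0.50 + 0.02 * i, "hcg": 0.10, "region": 0.50 + 0.02 * i}
       for i in range(8)]
toy[3]["region"] -= 0.2
residuals = knn_normalize(toy, k=2)
```

With the global trend removed, the region-specific loss in clone 3 stands out as the most negative residual, which is the kind of signal the normalization is meant to expose.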
Extended Data Fig. 8
Extended Data Fig. 8. GEMINIs rationale and cell-cycle modelling of DKO HCT116 cells.
a, Average methylation of EpCAM-high (blue, n = 51) and EpCAM-low (light-blue, n = 51) clones over a region of chromosome 20 (Top panel - 5kb bins, lower magnification: single CpGs). Green dots mark “cold” CpGs as defined in Fig. 3l. b, Bars indicate pooled expression levels in EpCAM-high and -low clones for genes within the genomic region shown in a. Chi-squared P values: TRIB3, RBCK1 < 2 × 10^-16, SOX12 = 7 × 10^-6, TBC1D20 = 2 × 10^-5, C20orf54 = 1.3 × 10^-3, CSNK2A1 = 3 × 10^-3. c, Distribution of deviation from persistency (blue trend in Fig. 3l, see Methods) of enhancer CpGs. Ep-high and Ep-low represent CpGs with differential methylation of 0.1 or higher in EpCAM high vs. low clones. n(other) = 767, n(Ep-high) = 122, n(Ep-low) = 152. Two-tailed KS test, Ep-high: D = 0.2, P = 4 × 10^-4. Ep-low: D = 0.16, P = 3 × 10^-3. d, Showing inter-clonal variance (Fig. 3l) for enhancer CpGs colored as in c. Green - epithelial-related CpGs in chr20 as in panel a. e, Schematics of the screen for GEMINIs. f, Bars indicate clones’ LCG average methylation, color-coded by the number of GEMINIs de-repressed in each clone. Two-tailed KS test (D = 0.22, P = 0.039), comparing LCG average methylation for clones with and without GEMINIs (excluding VIM-high clones). g, Coverage depth of DKO transcriptome. h, Statistics of single-cell-PBAT methylation profiles of 974 DKO cells (orange boxes) and 1,022 WT cells (blue boxes). i, Pooled average methylation in WT (blue line) and DKO (orange line) cells, as a function of genomic CpG content. j, Distribution of pooled methylation of DKO and WT HCT116 cells (x-axis), showing 1,988 CpGs with n > 8 calls in both pools. k, Normalized pooled expression (log2 UMI / 10K UMIs) in DKO clonal populations (x-axis) and DKO single-cells (y-axis). Genes with highest differential expression are highlighted.
l-m, Reproducibility of technical replicates in MARS-seq for 203 DKO clones, showing total UMI counts (log2 transformed) in two MARS-seq technical replicates and normalized expression of selected variable genes and gene-modules. Rho (ρ) represents Spearman’s correlation coefficient and r represents Pearson’s. n-p, Cell-cycle analysis of 3,371 single DKO cells, as shown in Fig. 1b and Extended Data Fig. 2d for wild-type HCT116.
Extended Data Fig. 9
Extended Data Fig. 9. Identification of cell-cycle independent transcriptional variance in DKO HCT116 single cells.
a, Maximal correlation values of each gene with another gene before (x-axis) and after (y-axis) cell-cycle based randomization of DKO cells (blue dots indicate genes maintaining cell-cycle variance, for full list see Supplementary Table 2). b, Matrix of gene-gene Spearman’s correlations in DKO cells before (upper triangle) and after (lower triangle) cell-cycle based randomization. c, Distribution of gene module expression per cell, classified by cell-cycle associated metacells in DKO (as defined in Extended Data Fig. 8n-p). d, Matrix of gene-gene Spearman’s correlations in DKO single cells (upper triangle) and DKO clones (lower triangle), indicating cell-cycle independent gene modules summarized in Supplementary Table 3. e, Genes with highest (blue bars) and lowest (red bars) expression change between EpCAM high and low DKO clones. f, Comparing gene expression fold-change of EpCAM high and low clones in HCT116 WT (x-axis) and DKO (y-axis).
Extended Data Fig. 10
Extended Data Fig. 10. Screening for in-TAD transcriptional memory in HCT116, A549 and WI38 cells.
a, Schematics of the screen for TAD de-repression. Clones can maintain deterministic repression of transcription in TADs that are “closed”. De-repression of a TAD in a clone can result in stochastic (possibly uncorrelated) de-repression of genes within it. b, Distribution of contact distances for 488M HCT116 Hi-C contacts. c, TADs are defined between two peaks of insulation (y-axis), as exemplified here for a segment of chromosome 7. d, Showing log mean expression in HCT116 clones (x-axis) and TAD auto-correlation scores (y-axis, see Methods). Genes showing statistically significant (positive) auto-correlation are labeled (light-blue for FDR < 0.25 and dark-blue for FDR < 0.05; for full list see Supplementary Table 9). e, We computed the correlation of expression between all genes and all TADs, and for each gene we measured the rank of its TAD auto-correlation. Shown is the distribution of these TAD auto-correlation ranks (value of 1 means the gene’s own TAD was the most correlated to it). f, Cumulative distribution of TAD auto-correlations in HCT116 clones, for observed data (black line) and for shuffled data (randomly assigning genes to TADs, grey line). g-i, Same screen as in d-f for A549 clones. j-l, Same screen as in d-f for WI38 clones. m, Showing fold-change expression of genes in HB-high vs. HB-low HCT116 clones (y-axis), over expression in HB-low clones (x-axis, left) and fold-change in HB-high single cells vs. HB-low cells (x-axis, right). n, Expression across HCT116 clones of selected genes that correlate with expression of the HB gene module (x-axis), compared to expression from genes in the beta-globin chromosomal domains (y-axis).
Figure 1
Figure 1. A Luria-Delbrück design for testing transcriptional and epigenetic memory.
a, Schematic overview of our experimental design. Green dashed arrows: culturing steps. Black dashed arrows: sorting of single cells into conditioned media. Non-dashed arrows/lines: processing steps. WES: whole-exome sequencing. b, Left: Expression of selected genes over 3,255 HCT116 single cells (columns) grouped into metacells (top labels) according to similarity in cell-cycle gene expression. Center: 2D Projection of the cell-cycle metacell model. Metacells (large ovals) are color coded according to the expression intensity of the cell-cycle marker MKi67; cells are shown as small gray dots. Right: Comparing expression of M-phase and S-phase genes (Supplementary Table 1) for cells and metacells. c, As in b, for 3,584 NCI-H1299 single cells. d, As in b, for 1,172 WI38 single cells. e, Normalized expression (UMI per 100k UMIs) distribution in HCT116 cells and clones of epithelial (EpCAM) and S-phase gene modules (as detailed in Supplementary Table 1 and Supplementary Table 3). f, As in e, for ID module and M-phase in NCI-H1299 cells. g, As in e, for Collagen module and M-phase in WI38 cells. h, Distribution of VIM expression (log2 of UMI per 10k UMIs) in HCT116 single cells (left) and clones (right). i, log2 expression fold changes for genes enriched in EpCAM high clones (blue, top 30% of clones, n = 51) and EpCAM low clones (red, lowest 30% of clones, n = 51), after exclusion of 11 VIM-high clones. For all genes shown here, FDR corrected q-value ≪ 0.001, chi-squared test. j, Shown are the total UMIs for genes positively (blue) and negatively (red) enriched in EpCAM-high clones, in clones grouped based on expression of the EpCAM gene module. k, Density plots of expression in EpCAM-high (black line), EpCAM-low (red line) and 11 VIM-high clones (orange line) for selected genes.
Figure 2
Figure 2. Long-term clonal maintenance of Epithelial and VIM-high transcriptional states.
a, Expression (log2 count per 10k UMI), of VIM (left panel) and the EpCAM module (right panel) for clonal populations that were sampled twice, after 10 and 18 days (> 15 k UMI per sample in both). b, Expression (note linear scale) of VIM and the EpCAM module in six clones selected for further longitudinal analysis. Dashed horizontal line represents median expression over all clones sampled after 10 (grey, n = 59) and 18 (dark grey, n = 59) days. c, Analysis of genetic kinship between HCT116 clones. Text in each cell shows the absolute count of shared SNPs between two clones. Upper bars show VIM (red) and EpCAM gene module (blue) expression (pooled single cells RNA in the closest time point). d, Metacell 2D projection for single-cell RNA-seq data from longitudinal analysis of six clones. Colors represent the level of EpCAM expression. e, Average expression of VIM (upper panel) and epithelial genes (lower panel) in tracked clones over six time-points (clonal RNA-seq at day = 10 and 18; pool of single-cell RNA-seq at day = 33, 62, 98 and 148). See panel g below for the number of cells at each time-point. Error-bars represent SE of binomial sampling, based on total sampled UMIs per clone per time-point. f-h, For each clone (row), showing total epithelial and VIM transcriptional output per cell by time-point (f); 2D projection of clones’ single cells, coloring according to the EpCAM module expression intensity (g) and the changes in VIM and the EpCAM gene module expression distributions over time (h).
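The error bars in panel e are described as the SE of binomial sampling, based on total sampled UMIs per clone per time-point. A minimal Python sketch of the standard binomial standard error for a proportion is given below. It assumes the quantity plotted is the fraction of a clone's UMIs that belong to the gene or module; the function name and the example counts are hypothetical.

```python
import math

def binomial_se(module_umis, total_umis):
    """Standard error of the fraction p = k / n under binomial sampling."""
    p = module_umis / total_umis
    return math.sqrt(p * (1 - p) / total_umis)

# Hypothetical clone: 200 module UMIs out of 20,000 sampled UMIs.
p = 200 / 20_000
se = binomial_se(200, 20_000)

# Express both on a per-10k-UMI scale, as used for expression in the captions.
per_10k = p * 1e4
se_per_10k = se * 1e4
```

Because the standard error shrinks with the square root of the number of sampled UMIs, a time-point with four times the depth gives error bars half as wide at the same expression level.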
Figure 3
Figure 3. Hot and cold dynamics of clonal methylation.
a, Distribution of clonal average methylation in low CpG content (LCG, 0-3% CG content) loci (observed vs. shuffled control, excluding VIM-high clones as defined in Fig. 1h). b, VIM expression by clonal LCG methylation. VIM-high clones are colored in red. c, Distribution of Spearman correlations between LCG methylation and gene expression over all genes. Controls are based on shuffling clonal LCG values. d, Clonal LCG methylation in early- and late-replicating genomic domains. e, Distributions of early and late replicating loci methylation over clones, indicating by s the standard deviations. f, As in a, for high CpG content (HCG, 7-15% CG content) loci. g, clonal LCG vs HCG average methylation, indicating lack of correlation. h, As in c, for HCG methylation. i, As in d, for HCG methylation (but excluding loci with overall average methylation higher than 0.3). j, As in e, for HCG methylation. k, We simulated two alternative methylation dynamics in clonal population assuming zero memory (left, mixture model) and perfect memory (right, persistency model). l, Shown are inter-clonal methylation variance vs. average methylation across well covered loci (running median is defined by a gray curve). Blue and orange lines depict the variance predicted by the persistency and mixture models (see Methods). Red and Blue dots mark partially methylated loci showing empirically “hot” (high turnover, mixture model) and “cold” (low turnover, persistency model) dynamics, respectively. m, We grouped 192 clones with sufficient coverage by their LCG (left) or HCG (right) methylation (minimal group size = 54, excluding VIM-high clones). Boxplots depict distribution of average methylation in hot and cold loci across the groups. In all boxplots throughout this manuscript we used R version 3.5.3 defaults for boxplot() function – where middle line indicates median, box limits are quantiles, and whiskers are 1.5 × IQR. 
Kolmogorov-Smirnov two-tailed test, LCG-high to LCG-mid: D = 0.39, P = 3 × 10^-5. LCG-low to LCG-mid: D = 0.49, P = 9 × 10^-7. HCG-high to HCG-mid, D = 0.31, P = 3 × 10^-3. HCG-low to HCG-mid: D = 0.29, P = 0.01. HCG-high to HCG-low: D = 0.39, P = 1 × 10^-4. n, Pooled average methylation of promoter CpGs in EpCAM-high (n = 51) and -low (n = 51) clones, highlighting promoters with up-regulated (blue) or down-regulated (red) expression in EpCAM-high clones (D = 0.3, P = 8 × 10^-5, KS two-tailed, n(blue) = 257, n(red) = 88). o, Promoter methylation in clones showing high (n = 51) and low (n = 51) EpCAM expression. Bars show average expression and error bars represent SE of binomial sampling. Chi-squared test, all P values < 2 × 10^-15. Panels below bars indicate chromosomal coordinates and show average methylation of covered CpGs in EpCAM-high (blue dots) and -low (light blue dots).
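The contrast in panels k-l between zero memory (mixture model) and perfect memory (persistency model) can be illustrated with a small simulation. This is a deliberately simplified Python sketch and not the authors' model: it assumes a single locus with one copy per cell, a population-average methylation m, no sequencing noise, and clones of equal size. Under these assumptions the inter-clonal variance is about m(1-m) with perfect memory and about m(1-m)/N without memory, where N is the number of cells per clone.

```python
import random
from statistics import pvariance

def simulate_clone_means(m, n_clones, cells_per_clone, memory, rng):
    """Average methylation of one locus in each simulated clone.
    memory=True:  every cell copies the founder's state (persistency).
    memory=False: every cell is drawn independently (mixture)."""
    means = []
    for _ in range(n_clones):
        if memory:
            means.append(1.0 if rng.random() < m else 0.0)
        else:
            methylated = sum(rng.random() < m for _ in range(cells_per_clone))
            means.append(methylated / cells_per_clone)
    return means

rng = random.Random(0)  # fixed seed so the run is reproducible
m = 0.5
cold = pvariance(simulate_clone_means(m, 500, 100, True, rng))   # low turnover
hot = pvariance(simulate_clone_means(m, 500, 100, False, rng))   # high turnover
```

The large gap between the two variances at the same average methylation is what allows partially methylated loci to be labelled as "cold" or "hot" from clonal data.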
Figure 4
Figure 4. Screening for Genes that Escaped Mitotically Inherited Inhibition (GEMINIs).
a, 98 GEMINIs were selected based on genes with low basal expression (less than 1 UMI / 10k UMIs) and high maximum de-repression (red dots, see Methods). For each gene, showing averaged basal expression across HCT116 clones (x-axis, see Methods), and GEMINI score indicating rare de-repression in few clones (y-axis, see Methods). b, Density plot of overall GEMINI pooled expression per cell (red line) and clone (black line). c, Distributions of normalized variances in single cells (log2 of variance-to-mean ratio) for 98 GEMINIs and 969 randomized matched controls with similar expression levels and promoter CG content. Two-tailed KS test, D = 0.21, P = 9 × 10^-4. d, Distributions of genomic features of GEMINIs and randomized controls matched for expression levels. High-exp gene: TSS within top 20 expression percentiles. Two-tailed KS test, from left to right: CG (D = 0.13; P = 8 × 10^-3, compared to matched-controls of expression only), Distance (0.22; 8 × 10^-8), Methylation (0.35; 2 × 10^-6). e, We annotated each clone with a “repressed” or “de-repressed” state for each one of the GEMINIs (see Methods). Bars show average methylation of pooled GEMINI promoters in their repressed (black bar, n = 20,797) and de-repressed clones (grey bar, n = 201). Error bars represent sampling SE, based on total methylation calls in de-repressed or repressed clones. Chi-squared test, χ² = 51, P = 9 × 10^-13. f, Average methylation of selected GEMINIs in their de-repressed clone. g, Distribution of gene expression correlation to clonal LCG (blue boxplots, as in Fig. 3c), and HCG (red boxplot, as in Fig. 3h) for GEMINIs and for matched controls. Two-sided KS test, LCG (D = 0.19; P = 5 × 10^-3), HCG (D = 0.16; P = 0.035). h, Left panel: GEMINIs (rows) expression in 300 HCT116 WT clones (columns). Expression levels are normalized by maximal expression value of each of the 98 GEMINIs. Order of columns is determined by clonal LCG average methylation.
Right panel: Expression of GEMINIs in DKO clones. Expression is normalized by the maximal expression of each GEMINI in its de-repressed WT clone (*** FDR corrected q-value < 0.001, KS test for comparison of GEMINIs expression in HCT116 WT and HCT116 DKO clones). i, Cumulative distribution for the fraction of repressed DKO clones for GEMINIs (red line) and for the matched randomized subset of control genes with similar CG-content and expression levels in WT (black line). De-repression threshold was set to half of the maximal normalized expression in WT clone. Two-sided KS test, D = 0.2, P = 2 × 10^-3. j, Comparison of de-repression in DKO for methylated and unmethylated genes. Showing data only on genes that were considered repressed in HCT116 WT (< 0.25 UMIs / 10k UMIs in at least 80% of the clones). Pooled methylation data on clones used to define methylated (> 0.9, n = 365) and unmethylated (< 0.9, n = 4,160) promoters. Chi-squared test: GEMINI/meth (χ² = 5.9; P = 0.015); GEMINI/unmeth (32; 2 × 10^-8); meth/unmeth (8.1; 4 × 10^-3). k, Cumulative distribution of genes showing rare de-repression (see Methods for definition of GEMINI score) in 399 colon adenocarcinoma RNA-seq samples (obtained from TCGA). GEMINI score compared expression in maximal tumor to 95th expression quantile. Showing only genes that were generally repressed in both HCT116 clones and TCGA samples. CTAs represent 77 annotated Cancer-Testis Antigens. Two-tailed KS test, GEMINIs (D = 0.28, P = 5 × 10^-5), CTA (0.42; 2 × 10^-9).
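Many comparisons in these captions report a two-sample Kolmogorov-Smirnov statistic D. For readers unfamiliar with it, D is the largest vertical gap between the empirical cumulative distributions of the two samples. The Python sketch below computes D directly from that definition; the sample values are hypothetical, the P value is not computed, and the quadratic-time loop is chosen for clarity rather than speed.

```python
def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov D: the largest absolute difference
    between the two empirical cumulative distribution functions,
    evaluated at every observed value."""
    d = 0.0
    for x in list(sample_a) + list(sample_b):
        cdf_a = sum(v <= x for v in sample_a) / len(sample_a)
        cdf_b = sum(v <= x for v in sample_b) / len(sample_b)
        d = max(d, abs(cdf_a - cdf_b))
    return d

# Hypothetical scores for a gene set and its matched controls.
gene_set = [0.9, 1.4, 2.1, 2.8, 3.5]
controls = [0.2, 0.5, 0.8, 1.1, 3.0]
d = ks_statistic(gene_set, controls)
```

D ranges from 0 (identical empirical distributions) to 1 (no overlap), so values such as D = 0.2 to 0.4 in the captions indicate moderate shifts whose significance depends on the sample sizes.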
Figure 5
Figure 5. Evidence for in-TAD memory.
a, Normalized expression of genes spanning the beta-globin TAD on chromosome 11. Heatmap is showing gene (rows) expression over clones (columns), normalized to the gene’s median expression across clones. Top bars annotate clones as HB-high and -low (VIM-high clones are excluded). Spatial map using one letter encoding (left of heatmap) is shown at the bottom, also indicating TAD borders as grey dashed vertical lines. b, Pooled average expression in HBE1-high (n = 76, blue) and HBE1-low (n = 77, black) clones defined in a. Error bars represent SE of sampling a binomial distribution. Chi-squared P values: HBG1, OR51B5 < 2 × 10^-16, OR51B4 = 5 × 10^-12, OR51B2 = 0.018, OR51I1 = 9 × 10^-8. c, Expression of embryonic (HBE1) and fetal (HBG2, HBG1) beta-globin genes (x-axis) vs. expression of adjacent cluster of olfactory receptors (y-axis). Dots represent clones and Spearman correlation is indicated here and in other figure panels. n = 168 clones covered by >100k UMIs. d, Expression of the HB (x-axis) and EpCAM (y-axis) modules, indicating lack of correlation. e, Temporal change in HB module expression for six clones, similar to Figure 2e. Error bars represent sampling SE, based on total sampled UMIs per clone per time-point. f, As in a, for KRTAP region on chromosome 17. Spatial map of genes is drawn on top, and the SHAMAN-normalized contact frequency map is depicted as a triangle respective to the region’s linear coordinates (HCT116 Hi-C data obtained from Rao et al. 2017, see Methods). g, As in b, Bars indicate average pooled expression in LOC730755-high (brown, n = 50) and LOC730755-low (orange, n = 55) clones defined in f. Error bars represent SE of binomial sampling. Chi-squared P values: KRTAP3-1, KRTAP3-2 < 2 × 10^-16, KRTAP2-2 = 0.03, KRTAP4-12 = 3 × 10^-16. h, Comparing total clonal expression of four genes in the KRTAP2 sub-cluster (x-axis) and 8 genes in the KRTAP3, 4 sub-cluster (y-axis). Sample size as in c.
i, Comparing clonal expression of KRTAP-associated genes (x-axis) and the EpCAM module (y-axis). j, Similar to e, for KRTAP-associated genes. k, Distribution of HB region expression (genes E-L in a) in clones and single cells. l, Distribution of KRTAP region expression (genes E-O in f) in clones and single cells. Vertical dashed lines in k-l indicate the overall normalized expression in clones (black) and cells (red).
