Elife. 2018 Jul 10:7:e35788.
doi: 10.7554/eLife.35788.

A promoter interaction map for cardiovascular disease genetics

Lindsey E Montefiori et al. Elife. 2018.

Abstract

Over 500 genetic loci have been associated with risk of cardiovascular diseases (CVDs); however, most loci are located in gene-distal non-coding regions and their target genes are not known. Here, we generated high-resolution promoter capture Hi-C (PCHi-C) maps in human induced pluripotent stem cells (iPSCs) and iPSC-derived cardiomyocytes (CMs) to provide a resource for identifying and prioritizing the functional targets of CVD associations. We validate these maps by demonstrating that promoters preferentially contact distal sequences enriched for tissue-specific transcription factor motifs and are enriched for chromatin marks that correlate with dynamic changes in gene expression. Using the CM PCHi-C map, we linked 1999 CVD-associated SNPs to 347 target genes. Remarkably, more than 90% of SNP-target gene interactions did not involve the nearest gene, while 40% of SNPs interacted with at least two genes, demonstrating the importance of considering long-range chromatin interactions when interpreting functional targets of disease loci.

Keywords: GWAS; capture Hi-C; cardiomyocytes; cardiovascular disease; chromosomes; gene expression; gene regulation; human; human biology; medicine.

Conflict of interest statement

LM, DS, NS, IA, AJ, GH, GB, IM, EM, MN: No competing interests declared.

Figures

Figure 1. General features of promoter interactions.
(A) Venn diagram displaying the number of cell-type-specific and shared promoter interactions in each cell type. (B) Proportion of interactions in each distance category: promoter (P)-promoter (both interacting ends overlap a transcription start site (TSS)); P-proximal (non-promoter end overlaps captured region but not the TSS); P-distal (non-promoter end is outside of captured region). Note that all promoter interactions are separated by at least 10 kb. (C) Distribution of the distances spanning each interaction in iPSCs and CMs. The red line depicts the median (170 kb in iPSCs, 164 kb in CMs); the black line depicts the mean (208 kb in iPSCs, 206 kb in CMs). (D) A ~2 Mb region of chromosome 8 encompassing the GATA4 gene is shown along with pre-capture (whole genome) Hi-C interaction maps at 40 kb resolution for iPSCs (top) and CMs (bottom). TADs called with TopDom are shown as colored bars (median TAD size = 640 kb in both cell types, mean TAD size = 742 kb in iPSCs and 743 kb in CMs) and significant PCHi-C interactions as colored arcs. (E) Zoomed-in view of the GATA4 locus (promoter highlighted in yellow) in iPSCs (top) and CMs (bottom) along with corresponding RNA-seq data generated as part of this study, and ChIP-seq data for H3K27ac, H3K4me1, H3K27me3 and CTCF from the Epigenome Roadmap Project/ENCODE (H1 and left ventricle for iPSC and CM, respectively). Filtered GATA4 read counts used by CHiCAGO are displayed in blue with the corresponding significant interactions shown as arcs. For clarity, only GATA4 interactions are shown. Gray highlighted regions show interactions overlapping in vivo validated heart enhancers (pink boxes), with representative E11.5 embryos for each enhancer element (Visel et al., 2007). Red arrowhead points to the heart.
Figure 1—figure supplement 1. Quality control of iPSC-CMs.
(A) Flow cytometry of iPSC-derived cardiomyocytes. Representative image of flow data for cardiomyocytes (left) and percent cardiac troponin T (cTnT) positive for each differentiation (right). Cells were first gated on live/dead and then on cTnT staining. (B) Principal component analysis of RNA-seq data in iPSCs and CMs along with H1 embryonic stem cells, left ventricular cells (LV), fetal heart cells (FH), and lymphoblastoid cell line cells (LCL). LCLs cluster independently from iPSC and CM, indicating that iPSCs were faithfully reprogrammed. (C) Percentage of Epigenome Roadmap H3K27ac ChIP-seq peaks overlapping iPSC and CM H3K27ac peaks. Overlaps for all peaks and only non-promoter peaks are shown. LV, left ventricle; H1, H1 embryonic stem cell line. (D) Three genome browser snapshots displaying the epigenetic landscape in CMs compared to left ventricle, right atria, adult liver and brain hippocampus from the Epigenome Roadmap.
Figure 1—figure supplement 2. Analysis of RNA-seq in iPSCs and iPSC-CMs.
(A) Cluster analysis of RNA-seq data from each triplicate of iPSC and CM. (B) Number of genes differentially expressed in each cell type. (C) Selected genes overexpressed in CMs relative to iPSCs. (D) Gene Ontology enrichment analysis of the biological processes associated with the 4802 genes overexpressed in cardiomyocytes.
Figure 1—figure supplement 3. Analysis of PCHi-C interactions in the context of TADs.
In this analysis, interactions were classified as intra-TAD (both ends of the interaction fully within a single TAD) or inter-TAD (each end of the interaction is in a different TAD). Interactions falling partially or wholly within TAD ‘boundaries’ or ‘gaps’ as defined by TopDom were omitted (see Materials and methods). (A) Proportion of interactions that are intra-TAD at different cut-offs. All analyses used interactions that were 100% within a TAD. (B) Proportion of promoter-promoter interactions in the set of intra-TAD and inter-TAD interactions. (C,D) Fold enrichment for intra-TAD and inter-TAD interactions to overlap CTCF (C) or H3K27ac peaks (D). Only promoter-distal ChIP-seq peaks were analyzed. ***p<2.2 × 10⁻¹⁶, Z-test. (E) CHiCAGO score and (F) interaction span of intra- vs. inter-TAD interactions. ***p<2.2 × 10⁻¹⁶, Wilcoxon rank-sum test. (G,H) Considering promoters with an intra-TAD interaction, an inter-TAD interaction, or exclusively intra-TAD or inter-TAD interactions: (G) distance from the promoter TSS to the nearest TAD boundary and (H) average TPM value of the promoter. ***p<2.2 × 10⁻¹⁶, **p<0.01, *p<0.05, NS = not significant, Wilcoxon rank-sum test.
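The intra-TAD versus inter-TAD classification described in this caption can be sketched as below. This is a minimal illustration, not the authors' code: the interval representation, the function names `containing_tad` and `classify_interaction`, and the simplification that any interaction end not fully inside a TAD is treated as lying in a boundary or gap (and therefore omitted) are all assumptions made for the example.

```python
from typing import List, Optional, Tuple

# Hypothetical representation: half-open [start, end) coordinates on one chromosome.
Interval = Tuple[int, int]


def containing_tad(fragment: Interval, tads: List[Interval]) -> Optional[int]:
    """Return the index of the TAD that fully contains the fragment, or None."""
    start, end = fragment
    for index, (tad_start, tad_end) in enumerate(tads):
        if tad_start <= start and end <= tad_end:
            return index
    return None


def classify_interaction(
    promoter_end: Interval, other_end: Interval, tads: List[Interval]
) -> str:
    """Label an interaction as 'intra-TAD', 'inter-TAD', or 'omitted'.

    Assumption: an end that is not fully inside any TAD overlaps a
    boundary or gap, so the whole interaction is omitted.
    """
    promoter_tad = containing_tad(promoter_end, tads)
    other_tad = containing_tad(other_end, tads)
    if promoter_tad is None or other_tad is None:
        return "omitted"
    return "intra-TAD" if promoter_tad == other_tad else "inter-TAD"
```

For example, with two TADs at `(0, 1000)` and `(1200, 2000)`, an interaction between `(100, 200)` and `(800, 900)` is labeled intra-TAD, while one ending at `(1050, 1100)` falls in the gap and is omitted.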
Figure 2. Transcription factor motif enrichment in distal interacting regions.
(A,B) Selected transcription factor (TF) motifs identified using HOMER in the promoter-distal interacting sequences for all over-expressed genes in (A) iPSCs and (B) CMs (fold change > 1.5, Padj < 0.05). ‘% sites’ refers to the percent of distal interactions overlapping the motif; rank is based on p-value significance. (C) To compare motif ranks across gene sets, the inverse of the rank is plotted for selected motifs identified in distal interactions from over- or under-expressed genes in both iPSCs and CMs. (D) The top 50 motifs identified in cell-type-specific interactions. OSN, OCT4-SOX2-TCF-NANOG motif.
Figure 3. Enrichment of promoter interactions to distal regulatory features.
(A,B) Proportion of promoter-distal interactions overlapping a histone ChIP-seq peak compared to random control MboI fragments (see Materials and methods). iPSC interactions were overlapped with H1 ESC ChIP-seq data; CM interactions were overlapped with left ventricle ChIP-seq data from the Epigenome Roadmap Project (Supplementary file 10). (C) Fold enrichment of the data presented in (A) and (B). (D) Fold enrichment of promoter-distal interactions based on the expression level of the promoter. Promoters were grouped into five bins according to their average TPM values. Dashed line indicates no enrichment. (E) Fold enrichment of cell-type-specific and shared interactions (columns) to tissue-specific and shared chromatin features (rows). (F) Example of the NPPA gene in iPSCs (top) and CMs (bottom). Gray box highlights CM-specific interactions to CM-specific chromatin marks and an in vivo heart enhancer (Visel et al., 2007). For clarity, only interactions for NPPA are shown. *p<0.00001, #p=0.0017, Z-test.
Figure 3—figure supplement 1. Correlation between the number of histone ChIP-seq peaks within 300 kb of promoters and gene expression level.
Number of promoter-distal histone ChIP-seq peaks within 300 kb of promoters in iPSC (A) and CM (B). Spearman’s rho (ρ) was calculated on the full set of promoter expression values/peak counts for all promoters with at least one significant interaction in the respective cell type (12,926 genes for iPSC and 13,555 genes for CM; see Materials and methods). Data are grouped by expression category to emphasize the trend. Horizontal bars indicate the median for each expression category. All correlation estimates are significant at p<2.2 × 10⁻¹⁶ except for H3K27me3 in iPSCs (p=0.06).
Figure 4. A/B compartment switching corresponds to activation of tissue-specific genes.
(A) Top panel: 10 Mb region on chromosome 4 showing A (green) and B (blue) compartments based on the first principal component calculated by HOMER (Heinz et al., 2010) from the whole-genome Hi-C and capture Hi-C interaction data. Bottom panel: zoomed in on the CAMK2D locus; only capture Hi-C A/B compartments shown. Inset: expression level of CAMK2D in iPSCs and CMs across the three replicates. (B) Expression level (TPM) of genes located in the A (green) or B (blue) compartment in each replicate of iPSC (left) or CM (right). (C) Difference in expression level (log2 fold change relative to iPSCs) of genes switching compartments from iPSC to CM or remaining in stable compartments. (D) Gene Ontology analysis of biological processes associated with genes switching from B to A compartments during iPSC-CM differentiation. ***p<2.2 × 10⁻¹⁶, Wilcoxon rank-sum test.
Figure 4—figure supplement 1. Comparison of A/B compartments in Hi-C and PCHi-C.
Correlation between the A/B compartment score (principal component analysis of interaction data, PC-1) in whole-genome Hi-C (y-axis) and promoter capture Hi-C (x-axis) in iPSCs (top) and CMs (bottom). Spearman’s ρ > 0.98, p<2.2 × 10⁻¹⁶ in all cases.
Figure 4—figure supplement 2. Example of A/B compartments.
Genome browser snapshot of a ~53 Mb region on chromosome 4 showing A/B compartments in all three replicates of iPSCs and CMs using both whole-genome (WG) and promoter capture Hi-C data.
Figure 4—figure supplement 3. GO analysis on the genes switching from active A compartments in iPSCs to inactive B compartments in CMs.
Figure 5. CM promoter interactions link CVD GWAS SNPs to target genes.
(A) Distribution of genomic distances separating SNP-target gene interactions (red line, median = 185 kb; black line, mean = 197 kb). (B) Pie chart showing the number of TSSs skipped for each SNP-target gene interaction (left) and the number of genes contacted by each SNP (right). (C) GO enrichment analysis for genes looping to LD SNPs using the CM promoter interaction data (left panel) or the iPSC promoter interaction data (right panel). (D) Proportion of target genes that result in a cardiovascular phenotype when knocked out in the mouse (MGI database [Blake et al., 2017]), compared to a random control set. p-Value calculated with a Z-test. (E) Proportion of GWAS LD SNPs that are eQTLs in left ventricle (LV) when considering either the full set of LD SNPs, or the subset that overlap CM promoter interactions. p-Value calculated with Fisher’s exact test. (F) Proportion of LV eQTLs (genome-wide) that map within a promoter interaction for the eQTL-associated gene (indicated by the red line). Random permutations were obtained by re-assigning each promoter’s set of interactions to a new promoter and calculating the proportion of eQTLs in random interactions that interact with their eQTL-associated gene. Proportions only consider eQTLs that overlap a promoter-distal interaction. p-Values calculated with a Z-test.
Figure 6. Characterizing target genes based on expression level.
(A) Log2 fold change of the expression level of target genes in CMs compared to iPSCs (horizontal bar indicates median, 1.08; diamond indicates mean, 1.44). (B) Average TPM values of target genes in iPSCs and CMs (p=0.12, Wilcoxon rank-sum test). Diamonds indicate the mean value (40.6 for iPSC, 60.1 for CM). (C) Comparison of average TPM values for target genes in CMs and iPSCs. See Supplementary file 8 for full list of genes and TPM values. (D,E) Examples of genes looping to cardiac arrhythmia GWAS SNPs in CMs. (D) The TBX5 gene interacts with a functionally validated arrhythmia locus (Smemo et al., 2012). (E) The LITAF gene interacts with a locus identified in (Arking et al., 2014). Yellow highlighted region indicates the promoter; gray box and zoom panel show the promoter-interacting regions (pink boxes) overlapping arrhythmia SNPs. For clarity, only interactions for the indicated promoter are shown.
Figure 7. Relevance of CM promoter interactions for cardiac arrhythmia, myocardial infarction and heart failure.
(A–C) Gene Ontology analysis for target genes looping to (A) cardiac arrhythmia SNPs, (B) myocardial infarction SNPs, and (C) heart failure SNPs. (D) The SORT1 promoter loops to a distal myocardial infarction locus (Musunuru et al., 2010). The rs12740374 SNP shown to disrupt a C/EBP binding site in (Musunuru et al., 2010) is colored red. (E) The ACTA2 promoter loops to the 10q21 heart failure locus (Smith et al., 2010). Zoom plots depict the full interacting region overlapping GWAS LD SNPs. For clarity, only interactions for the indicated gene are shown.
Author response image 1. Cis-regulation within TADs.
Genome browser snapshot of the IRX5 locus in iPSCs (top) and CMs (bottom). Yellow highlighted region is the IRX5 promoter. CM-specific interactions to a Vista heart enhancer and H3K27ac peaks are highlighted in gray. Note the relatively invariant TAD structure over this region, compared to the dynamic within-TAD IRX5 promoter interactions between the two cell types (black arrowheads).
Author response image 2. Correlation of expression with number of enhancer contacts.
(A) Genes were grouped into 5 categories according to expression levels (q0=TPM 0, q1=TPM 0-3, q2=TPM 3-25, q3=TPM 25-150, q4=TPM>150) and the number of promoter-distal H3K27ac ChIP-seq peaks contacted by each promoter is displayed. The blue vertical bar indicates the median. (B,C) The median number of H3K27ac peaks contacted by promoters in each expression group is plotted against the expression group value for iPSC (B) and CM (C). Only promoter-distal interactions were considered. Spearman’s rho values are shown for the correlation estimate between expression and number of enhancers contacted. The same correlations were obtained when grouping genes by hard quantile cut-offs instead of TPM values, as in Schoenfelder et al. Genome Research 2015.
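The TPM-based grouping in panel (A) can be illustrated with a short sketch. It is an assumption-laden example rather than the authors' analysis code: the caption gives the bin ranges but not how values exactly on a cut-off are assigned, so the choice here (upper bounds inclusive) is hypothetical, as are the function names `expression_group` and `median_contacts_by_group`.

```python
from collections import defaultdict
from statistics import median
from typing import Dict, Iterable, Tuple


def expression_group(tpm: float) -> str:
    """Assign a gene to q0..q4 using the TPM ranges given in the caption.

    Assumption: upper bin edges (3, 25, 150) are inclusive.
    """
    if tpm < 0:
        raise ValueError("TPM cannot be negative")
    if tpm == 0:
        return "q0"
    if tpm <= 3:
        return "q1"
    if tpm <= 25:
        return "q2"
    if tpm <= 150:
        return "q3"
    return "q4"


def median_contacts_by_group(
    genes: Iterable[Tuple[float, int]]
) -> Dict[str, float]:
    """Median number of contacted H3K27ac peaks per expression group.

    Each item in `genes` is (TPM, number of promoter-distal peaks contacted).
    """
    grouped = defaultdict(list)
    for tpm, n_peaks in genes:
        grouped[expression_group(tpm)].append(n_peaks)
    return {group: median(counts) for group, counts in sorted(grouped.items())}
```

The per-group medians returned by `median_contacts_by_group` correspond to the values that panels (B) and (C) plot against expression group.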
Author response image 3. Enrichment of in vivo validated enhancers (from the Vista Enhancer Browser) in CM promoter-distal interactions.
Top, fold-enrichment of the observed number of enhancers compared to 1000 permutations of enhancer locations. Numbers above the error bars indicate the number of enhancer elements in each group. Bottom, corresponding Z-score for each enrichment. Heart enhancer data is highlighted in red.
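A permutation-based fold enrichment and Z-score of the kind shown in this figure can be computed as in the sketch below. The caption does not state the exact formulas, so this example assumes the common definitions: fold enrichment as the observed count divided by the mean permuted count, and Z-score as the observed count minus that mean, divided by the sample standard deviation of the permuted counts. The function name `permutation_enrichment` is hypothetical.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def permutation_enrichment(
    observed: float, permuted: Sequence[float]
) -> Tuple[float, float]:
    """Return (fold enrichment, Z-score) of an observed overlap count.

    `permuted` holds the overlap counts from shuffled enhancer locations
    (1000 permutations in the caption; any number >= 2 works here).
    Assumption: sample standard deviation is used for the Z-score.
    """
    if len(permuted) < 2:
        raise ValueError("need at least two permutations")
    expected = mean(permuted)
    spread = stdev(permuted)
    if expected == 0 or spread == 0:
        raise ValueError("permuted counts must have non-zero mean and variance")
    fold = observed / expected
    z_score = (observed - expected) / spread
    return fold, z_score
```

With an observed count of 30 and permuted counts of 8, 10 and 12, the expected count is 10 and the sample standard deviation is 2, giving a fold enrichment of 3 and a Z-score of 10.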

References

    1. Arking DE, Pulit SL, Crotti L, van der Harst P, Munroe PB, Koopmann TT, Sotoodehnia N, Rossin EJ, Morley M, Wang X, Johnson AD, Lundby A, Gudbjartsson DF, Noseworthy PA, Eijgelsheim M, Bradford Y, Tarasov KV, Dörr M, Müller-Nurasyid M, Lahtinen AM, Nolte IM, Smith AV, Bis JC, Isaacs A, Newhouse SJ, Evans DS, Post WS, Waggott D, Lyytikäinen LP, Hicks AA, Eisele L, Ellinghaus D, Hayward C, Navarro P, Ulivi S, Tanaka T, Tester DJ, Chatel S, Gustafsson S, Kumari M, Morris RW, Naluai ÅT, Padmanabhan S, Kluttig A, Strohmer B, Panayiotou AG, Torres M, Knoflach M, Hubacek JA, Slowikowski K, Raychaudhuri S, Kumar RD, Harris TB, Launer LJ, Shuldiner AR, Alonso A, Bader JS, Ehret G, Huang H, Kao WH, Strait JB, Macfarlane PW, Brown M, Caulfield MJ, Samani NJ, Kronenberg F, Willeit J, Smith JG, Greiser KH, Meyer Zu Schwabedissen H, Werdan K, Carella M, Zelante L, Heckbert SR, Psaty BM, Rotter JI, Kolcic I, Polašek O, Wright AF, Griffin M, Daly MJ, Arnar DO, Hólm H, Thorsteinsdottir U, Denny JC, Roden DM, Zuvich RL, Emilsson V, Plump AS, Larson MG, O'Donnell CJ, Yin X, Bobbo M, D'Adamo AP, Iorio A, Sinagra G, Carracedo A, Cummings SR, Nalls MA, Jula A, Kontula KK, Marjamaa A, Oikarinen L, Perola M, Porthan K, Erbel R, Hoffmann P, Jöckel KH, Kälsch H, Nöthen MM, den Hoed M, Loos RJ, Thelle DS, Gieger C, Meitinger T, Perz S, Peters A, Prucha H, Sinner MF, Waldenberger M, de Boer RA, Franke L, van der Vleuten PA, Beckmann BM, Martens E, Bardai A, Hofman N, Wilde AA, Behr ER, Dalageorgou C, Giudicessi JR, Medeiros-Domingo A, Barc J, Kyndt F, Probst V, Ghidoni A, Insolia R, Hamilton RM, Scherer SW, Brandimarto J, Margulies K, Moravec CE, del Greco M F, Fuchsberger C, O'Connell JR, Lee WK, Watt GC, Campbell H, Wild SH, El Mokhtari NE, Frey N, Asselbergs FW, Mateo Leach I, Navis G, van den Berg MP, van Veldhuisen DJ, Kellis M, Krijthe BP, Franco OH, Hofman A, Kors JA, Uitterlinden AG, Witteman JC, Kedenko L, Lamina C, Oostra BA, Abecasis GR, Lakatta EG, Mulas A, Orrú M, Schlessinger D, Uda M, Markus MR, Völker U, Snieder H, Spector TD, Ärnlöv J, Lind L, Sundström J, Syvänen AC, Kivimaki M, Kähönen M, Mononen N, Raitakari OT, Viikari JS, Adamkova V, Kiechl S, Brion M, Nicolaides AN, Paulweber B, Haerting J, Dominiczak AF, Nyberg F, Whincup PH, Hingorani AD, Schott JJ, Bezzina CR, Ingelsson E, Ferrucci L, Gasparini P, Wilson JF, Rudan I, Franke A, Mühleisen TW, Pramstaller PP, Lehtimäki TJ, Paterson AD, Parsa A, Liu Y, van Duijn CM, Siscovick DS, Gudnason V, Jamshidi Y, Salomaa V, Felix SB, Sanna S, Ritchie MD, Stricker BH, Stefansson K, Boyer LA, Cappola TP, Olsen JV, Lage K, Schwartz PJ, Kääb S, Chakravarti A, Ackerman MJ, Pfeufer A, de Bakker PI, Newton-Cheh C, CARe Consortium. COGENT Consortium. DCCT/EDIC. eMERGE Consortium. HRGEN Consortium. Genetic association study of QT interval highlights role for calcium signaling pathways in myocardial repolarization. Nature Genetics. 2014;46:826–836. doi: 10.1038/ng.3014.
    2. Arnolds DE, Liu F, Fahrenbach JP, Kim GH, Schillinger KJ, Smemo S, McNally EM, Nobrega MA, Patel VV, Moskowitz IP. TBX5 drives Scn5a expression to regulate cardiac conduction system function. Journal of Clinical Investigation. 2012;122:2509–2518. doi: 10.1172/JCI62617.
    3. Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, Harris MA, Hill DP, Issel-Tarver L, Kasarskis A, Lewis S, Matese JC, Richardson JE, Ringwald M, Rubin GM, Sherlock G. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nature Genetics. 2000;25:25–29. doi: 10.1038/75556.
    4. Banovich NE, Li YI, Raj A, Ward MC, Greenside P, Calderon D, Tung PY, Burnett JE, Myrthil M, Thomas SM, Burrows CK, Romero IG, Pavlovic BJ, Kundaje A, Pritchard JK, Gilad Y. Impact of regulatory variation across human iPSCs and differentiated cells. Genome Research. 2018;28:122–131. doi: 10.1101/gr.224436.117.
    5. Blake JA, Eppig JT, Kadin JA, Richardson JE, Smith CL, Bult CJ, the Mouse Genome Database Group. Mouse Genome Database (MGD)-2017: community knowledge resource for the laboratory mouse. Nucleic Acids Research. 2017;45:D723–D729. doi: 10.1093/nar/gkw1040.
