Integrative Genomic Analysis Predicts Causative Cis-Regulatory Mechanisms of the Breast Cancer-Associated Genetic Variant rs4415084

Cancer Res. 2018 Apr 1;78(7):1579-1591. doi: 10.1158/0008-5472.CAN-17-3486. Epub 2018 Jan 19.


Previous genome-wide association studies (GWAS) have identified several common genetic variants that may significantly modulate cancer susceptibility. However, the precise molecular mechanisms behind these associations remain largely unknown; it is often not clear whether discovered variants are themselves functional or merely genetically linked to other functional variants. Here, we provide an integrated method for identifying functional regulatory variants associated with cancer and their target genes by combining analyses of expression quantitative trait loci, a modified version of allele-specific expression that systematically utilizes haplotype information, transcription factor (TF)-binding preference, and epigenetic information. Application of our method to a breast cancer susceptibility region in 5p12 demonstrates that the risk allele rs4415084-T correlates with higher expression levels of the protein-coding gene mitochondrial ribosomal protein S30 (MRPS30) and the lncRNA RP11-53O19.1. We propose an intergenic SNP, rs4321755, in linkage disequilibrium (LD) with the GWAS SNP rs4415084 (r² = 0.988), to be the predicted functional SNP. The risk allele rs4321755-T, in phase with the GWAS rs4415084-T, created a GATA3-binding motif within an enhancer, resulting in differential GATA3 binding and chromatin accessibility, thereby promoting transcription of MRPS30 and RP11-53O19.1. MRPS30 encodes a member of the mitochondrial ribosomal proteins, implicating the risk SNP in modulating mitochondrial activities in breast cancer. Our computational framework provides an effective means to integrate GWAS results with high-throughput genomic and epigenomic data and can be extended to facilitate rapid functional characterization of other genetic variants modulating cancer susceptibility.

Significance: Unification of GWAS results with information from high-throughput genomic and epigenomic profiles provides a direct link between common genetic variants and measurable molecular perturbations. Cancer Res; 78(7); 1579-91. ©2018 AACR.
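The abstract reports that the candidate functional SNP and the GWAS SNP are in strong LD (r² = 0.988). As a minimal illustration of what that statistic measures, the Python sketch below computes the standard pairwise r² from phased haplotypes. It is not the authors' pipeline: the function name `ld_r_squared`, the toy haplotype counts, and the placeholder label "other" for the non-risk allele are all assumptions made for illustration only.

```python
from typing import Sequence, Tuple


def ld_r_squared(
    haplotypes: Sequence[Tuple[str, str]],
    allele_a: str,
    allele_b: str,
) -> float:
    """Pairwise LD r^2 between two biallelic SNPs from phased haplotypes.

    Each haplotype is a pair (allele at SNP 1, allele at SNP 2).
    r^2 = D^2 / (pA * (1 - pA) * pB * (1 - pB)), with D = pAB - pA * pB.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("no haplotypes supplied")

    # Frequencies of the chosen allele at each SNP and of the joint haplotype.
    p_a = sum(1 for a, _ in haplotypes if a == allele_a) / n
    p_b = sum(1 for _, b in haplotypes if b == allele_b) / n
    p_ab = sum(1 for a, b in haplotypes if a == allele_a and b == allele_b) / n

    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when either SNP is monomorphic")

    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    return d * d / denom


# Hypothetical toy data (NOT from the study): the two T alleles almost
# always travel together, with a single discordant haplotype.
toy_haplotypes = (
    [("T", "T")] * 40
    + [("other", "other")] * 59
    + [("T", "other")] * 1
)
r2 = ld_r_squared(toy_haplotypes, "T", "T")

# Perfectly co-inherited alleles, for comparison.
perfect_ld = ld_r_squared([("T", "T")] * 50 + [("other", "other")] * 50, "T", "T")
```

An r² close to 1 means the two alleles are almost always inherited on the same haplotype, which is why association data alone cannot separate a functional variant from a linked proxy and further functional evidence is needed.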

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Breast Neoplasms / genetics*
  • Cell Line, Tumor
  • Chromosomes, Human, Pair 5 / genetics*
  • GATA3 Transcription Factor / metabolism*
  • Gene Expression Regulation, Neoplastic / genetics
  • Genetic Predisposition to Disease / genetics
  • Genome-Wide Association Study
  • High-Throughput Nucleotide Sequencing
  • Humans
  • Mitochondrial Proteins / genetics*
  • Mitochondrial Proteins / metabolism
  • Polymorphism, Single Nucleotide / genetics
  • Quantitative Trait Loci
  • RNA, Long Noncoding / biosynthesis
  • RNA, Long Noncoding / genetics*
  • Regulatory Sequences, Nucleic Acid / genetics
  • Ribosomal Proteins / genetics*
  • Ribosomal Proteins / metabolism

Substances

  • GATA3 Transcription Factor
  • GATA3 protein, human
  • MRPS30 protein, human
  • Mitochondrial Proteins
  • RNA, Long Noncoding
  • Ribosomal Proteins