Trans-ethnic predicted expression genome-wide association analysis identifies a gene for estrogen receptor-negative breast cancer

PLoS Genet. 2017 Sep 28;13(9):e1006727. doi: 10.1371/journal.pgen.1006727. eCollection 2017 Sep.


Genome-wide association studies (GWAS) have identified more than 90 susceptibility loci for breast cancer, but the biology underlying those associations remains to be elucidated. Additional genetic factors for breast cancer are yet to be identified, but sample size constraints preclude the identification of individual genetic variants with weak effects using traditional GWAS methods. To address this challenge, we used a gene-level expression-based method, implemented in the MetaXcan software, to predict expression levels for 11,536 genes from expression quantitative trait loci and to test the genetically predicted expression of each gene for association with overall breast cancer risk and with estrogen receptor (ER)-negative breast cancer risk. Using GWAS datasets from a Challenge launched by the National Cancer Institute, we identified TP53INP2 (tumor protein p53-inducible nuclear protein 2) at 20q11.22 as significantly associated with ER-negative breast cancer (Z = -5.013, p = 5.35 × 10⁻⁷, Bonferroni threshold = 4.33 × 10⁻⁶). The association was consistent across four GWAS datasets representing European, African and Asian ancestry populations. Six single nucleotide polymorphisms (SNPs) were included in the prediction of TP53INP2 expression, and five of them were associated with ER-negative breast cancer, although none of the SNP-level associations reached genome-wide significance. We conducted a replication study using a dataset outside of the Challenge and found that the association between TP53INP2 and ER-negative breast cancer was significant (p = 5.07 × 10⁻³). Expression of HP (16q22.2) showed a suggestive association with ER-negative breast cancer in the discovery phase (Z = 4.30, p = 1.70 × 10⁻⁵), although the association was not significant after Bonferroni adjustment.
Of the 249 genes located within 250 kb of known breast cancer susceptibility loci identified in previous GWAS, 20 (8.0%) were statistically significantly associated with ER-negative breast cancer (p < 0.05), compared with 582 (5.2%) of the 11,287 genes that are not close to previous GWAS loci. This study demonstrated that expression-based gene mapping is a promising approach for identifying cancer susceptibility genes.
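
The arithmetic behind the reported thresholds and proportions can be checked with a short sketch. It assumes that the Bonferroni threshold is 0.05 divided by the 11,536 genes tested and that the reported p-values are two-sided normal tail probabilities of the Z statistics; both are assumptions for illustration, not statements from the abstract, and the function and variable names are hypothetical.

```python
import math


def two_sided_p(z: float) -> float:
    """Two-sided p-value for a standard normal Z statistic (assumed test form)."""
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for the standard normal CDF Phi.
    return math.erfc(abs(z) / math.sqrt(2))


# Assumed Bonferroni correction: nominal alpha divided by the number of genes tested.
n_genes = 11_536
alpha = 0.05
bonferroni = alpha / n_genes

# Z statistics reported in the abstract for the discovery phase.
p_tp53inp2 = two_sided_p(-5.013)
p_hp = two_sided_p(4.30)

# Share of nominally significant genes (p < 0.05) near and away from known GWAS loci.
share_near = 20 / 249
share_far = 582 / 11_287

print(f"Bonferroni threshold: {bonferroni:.3g}")
print(f"TP53INP2: p = {p_tp53inp2:.3g}, passes threshold: {p_tp53inp2 < bonferroni}")
print(f"HP:       p = {p_hp:.3g}, passes threshold: {p_hp < bonferroni}")
print(f"Near known loci: {share_near:.1%}; elsewhere: {share_far:.1%}")
```

Under these assumptions the recomputed values agree with the reported figures to within rounding: TP53INP2 falls below the threshold, whereas HP does not.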

MeSH terms

  • Breast Neoplasms / genetics*
  • Breast Neoplasms / pathology
  • Estrogen Receptor alpha / genetics*
  • Female
  • Gene Expression Regulation, Neoplastic
  • Genetic Predisposition to Disease
  • Genome-Wide Association Study
  • Haptoglobins / genetics*
  • Humans
  • Nuclear Proteins / genetics*
  • Polymorphism, Single Nucleotide

Substances


  • ESR1 protein, human
  • Estrogen Receptor alpha
  • HP protein, human
  • Haptoglobins
  • Nuclear Proteins
  • TP53INP2 protein, human