Prognostic genes of triple-negative breast cancer identified by weighted gene co-expression network analysis

Oncol Lett. 2020 Jan;19(1):127-138. doi: 10.3892/ol.2019.11079. Epub 2019 Nov 11.

Abstract

Triple-negative breast cancer (TNBC) is characterized by a lack of expression of the estrogen receptor (ER), progesterone receptor (PR) and HER2/neu genes. Patients with TNBC have an increased likelihood of distant recurrence and mortality compared with patients with other subtypes of breast cancer. The current study aimed to identify novel biomarkers for TNBC. Weighted gene co-expression network analysis (WGCNA) was applied to construct gene co-expression networks; these were used to explore the correlation between mRNA profiles and clinical data, thus identifying the co-expression network most significantly associated with the American Joint Committee on Cancer (AJCC)-TNM stage of TNBC. Using RNAseq datasets from The Cancer Genome Atlas, downloaded from the University of California, Santa Cruz, WGCNA identified 23 modules via K-means clustering. The most significant module consisted of 248 genes, on which gene ontology analysis was subsequently performed. Differentially expressed gene (DEG) analysis was then applied to determine the DEGs between normal and tumor tissues. A total of 42 genes lay in the overlap between the DEGs and the most significant module. Following survival analysis, 5 genes [GIPC PDZ domain containing family member 1 (GIPC1), hes family bHLH transcription factor 6 (HES6), calmodulin-regulated spectrin-associated protein family member 3 (KIAA1543), myosin light chain kinase 2 (MYLK2) and peter pan homolog (PPAN)] were selected and their association with the AJCC-TNM stage was investigated. The expression levels of these genes varied between pathological stages, but tended to increase in more advanced stages. In receiver operating characteristic curve analysis, the expression of these 5 genes accurately distinguished tumor from normal tissues. High expression of GIPC1, HES6, KIAA1543, MYLK2 and PPAN was associated with poor overall survival (OS) in patients with TNBC. In conclusion, via unsupervised clustering methods, a co-expressed gene network with high inter-connectivity was constructed, and 5 genes were identified as biomarkers for TNBC.
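
The workflow summarized above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' pipeline: the gene names (GENE_A, GENE_B, GENE_C), all expression values and the soft-threshold power beta = 6 are hypothetical placeholders, and the function names are invented for this example. It shows three of the steps the abstract names: a WGCNA-style soft-thresholded adjacency (absolute Pearson correlation raised to a power), the overlap between a module and a DEG list, and a rank-based area under the ROC curve for separating tumor from normal samples.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def soft_threshold_adjacency(expr, beta=6):
    """Unsigned WGCNA-style adjacency: a_ij = |cor(x_i, x_j)| ** beta.

    expr maps a gene name to its expression values across samples.
    beta is an assumed power; the abstract does not report the one used.
    """
    genes = sorted(expr)
    return {
        (g, h): abs(pearson(expr[g], expr[h])) ** beta
        for g in genes
        for h in genes
        if g < h
    }


def roc_auc(tumor_values, normal_values):
    """AUC as the probability that a random tumor sample has higher
    expression than a random normal sample (ties count as 0.5)."""
    wins = 0.0
    for t in tumor_values:
        for n in normal_values:
            if t > n:
                wins += 1.0
            elif t == n:
                wins += 0.5
    return wins / (len(tumor_values) * len(normal_values))


# Hypothetical toy expression matrix (gene -> values across four samples).
expr = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0],
    "GENE_B": [2.0, 4.0, 6.0, 8.0],
    "GENE_C": [4.0, 1.0, 3.0, 2.0],
}
adjacency = soft_threshold_adjacency(expr, beta=6)

# Hypothetical module membership and DEG list; candidates are their overlap.
module_genes = {"GENE_A", "GENE_B"}
deg_genes = {"GENE_B", "GENE_C"}
candidates = module_genes & deg_genes

# Hypothetical expression of one candidate gene in tumor vs normal tissue.
auc = roc_auc([5.1, 6.3, 7.0], [2.2, 3.1, 5.1])
```

In a real analysis the module assignment would come from clustering the adjacency (or a topological overlap matrix derived from it), and the surviving candidates would then go through survival analysis, which this sketch omits.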

Keywords: progression; triple-negative breast cancer; weighted gene co-expression network analysis.