BMC Med. 2022 Nov 9;20(1):437.
doi: 10.1186/s12916-022-02630-8.

Sequential gene expression analysis of cervical malignant transformation identifies RFC4 as a novel diagnostic and prognostic biomarker

Jianwei Zhang et al. BMC Med. 2022.

Abstract

Background: Cervical squamous cell carcinoma (SCC) is known to arise through increasingly higher-grade squamous intraepithelial lesions (SILs) or cervical intraepithelial neoplasias (CINs). This study aimed to describe sequential molecular changes and identify biomarkers in cervical malignant transformation.

Methods: Multidimensional data from five publicly available microarray and TCGA-CESC datasets were analyzed. Immunohistochemistry was carried out on 354 cervical tissues (42 normal, 62 CIN1, 26 CIN2, 47 CIN3, and 177 SCC) to determine the potential diagnostic and prognostic value of identified biomarkers.

Results: Normal epithelium and SILs showed higher molecular homogeneity than SCC. Genes in regions with copy number alteration or HPV integration (e.g., 3q, 12q13) were more likely to lose or gain expression. The IL-17 signaling pathway was enriched throughout disease progression, with downregulation of IL17C and decreased Th17 cells at the late stage. Furthermore, we identified AURKA, TOP2A, RFC4, and CEP55 as potential causative genes that were gradually upregulated during the normal-SILs-SCC transition. For detecting high-grade SIL (HSIL), TOP2A and RFC4 showed balanced sensitivity (both 88.2%) and specificity (87.1% and 90.1%, respectively), with high AUC (0.88 and 0.89, respectively). Each marker alone had diagnostic performance equivalent to the combination of p16INK4a and Ki-67. Meanwhile, increased expression of RFC4 significantly and independently predicted favorable outcomes in multi-institutional cohorts of SCC patients.

Conclusions: Our comprehensive study of gene expression profiling has identified dysregulated genes and biological processes during cervical carcinogenesis. RFC4 is proposed as a novel surrogate biomarker for determining HSIL and HSIL+, and an independent prognostic biomarker for SCC.

Keywords: Biomarkers; Cervical cancer; Molecular changes; RFC4; Squamous intraepithelial lesions.
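
As a reading aid for the sensitivity and specificity values quoted in the Results, the following is a minimal sketch of how these two metrics are computed from paired binary labels. It is not the authors' analysis code; the function name `sensitivity_specificity` and the toy data are hypothetical and chosen only for illustration.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    truth: Sequence[bool], predicted: Sequence[bool]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for paired binary labels.

    truth[i] is True when sample i is truly positive (e.g., HSIL);
    predicted[i] is True when the marker stains positive for sample i.
    """
    if len(truth) != len(predicted):
        raise ValueError("truth and predicted must have the same length")

    # Count the four cells of the 2x2 confusion matrix.
    tp = sum(1 for t, p in zip(truth, predicted) if t and p)
    fn = sum(1 for t, p in zip(truth, predicted) if t and not p)
    tn = sum(1 for t, p in zip(truth, predicted) if not t and not p)
    fp = sum(1 for t, p in zip(truth, predicted) if not t and p)

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative sample")

    sensitivity = tp / (tp + fn)  # fraction of true positives detected
    specificity = tn / (tn + fp)  # fraction of true negatives correctly excluded
    return sensitivity, specificity


# Hypothetical toy data (not from the study): 4 positives, 5 negatives.
truth = [True, True, True, True, False, False, False, False, False]
predicted = [True, True, True, False, False, False, False, True, False]
sens, spec = sensitivity_specificity(truth, predicted)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

With the toy data above, three of four positives are detected and four of five negatives are excluded, so the sketch reports a sensitivity of 0.75 and a specificity of 0.80.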

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1
Schematic workflow of the present study. TCGA-CESC, The Cancer Genome Atlas cervical squamous cell carcinoma and endocervical adenocarcinoma.
Fig. 2
Integrated analysis. A Principal component analysis of the most variable 1000 genes in three datasets. The first two principal components are displayed and colored according to the disease stage. The shaded ellipses represent the 95% confidence intervals. B Boxplots representing Pearson correlation coefficient distribution between cases at each stage. Kruskal-Wallis test followed by pairwise Wilcoxon rank-sum test with Bonferroni correction was used to compare correlations. *** p < 0.001. C Heatmap displaying Pearson correlations between pairwise comparisons for all samples. D Bar plot representing the number of DEGs in each disease stage across discovery datasets. E UpSet diagrams showing the intersection size of DEGs in HN and CN across discovery datasets. The colors of the matrix background represent up- (red) and downregulated (blue) genes. Orange bars represent the number of DEGs explained by genes unique to GPL570. Pale violet-red dots indicate inconsistently changed genes among different studies. F Venn diagram showing the number of shared and unique genes annotated in GPL96, GPL570, and GPL571. G Circos plot illustrating the landscape of chromosomal positions, expression, CNVs, and significant chromosomal bands. Autosomes 1–22 and sex chromosome X are shown in the right half of the circle. Zoomed chromosomes 1, 3, and 19 are displayed in the left half. Tracks from innermost to outermost: expression heatmap of Sets1 genes along lesion severity gradient (LN-HN-CN, Tracks 1–3), frequency of gains (red) and losses (blue) for regions of each chromosome from TCGA-SCC (Track 4), chromosome cytobands (Track 5), and significantly enriched cytobands in zoomed chromosomes (Track 6). H Dot plot showing the results of cytoband enrichment. Dot color indicates the q-value of the enrichment test; dot size represents the fraction of genes annotated to each cytoband. A q-value < 0.05 was considered statistically significant.
Fig. 3
Functional enrichment and Th17 cell infiltration. A Heatmap showing the GO biological process terms enriched by up- and downregulated genes (Gene Sets1, columns 1 to 6) and total DEGs (columns 7 to 9) of each comparison. The simplified GO terms significantly enriched by at least one comparison group were included and hierarchically clustered. Eight distinct clusters of GO terms that show high semantic similarity were identified. The color intensities indicate the −log10(q-value) of the enrichment test. See Additional file 2: Table S8 for the entire list of the enriched GO terms. B Dot plot showing the top 8 significantly enriched KEGG pathways in up- and downregulated genes (Gene Sets1, left panel) and total DEGs (right panel) of each comparison. Dot color indicates the q-value of the enrichment test; dot size represents the fraction of genes annotated to each pathway. The full list of enriched pathways and comparisons is given in Additional file 2: Table S9 and Additional file 1: Fig. S3. C Volcano plots of GSE63514. Red and blue dots represent up- and downregulated genes, respectively; gray dots represent non-statistically significant genes. Vertical dashed lines indicate a 2-fold change cutoff in either direction, and horizontal dashed lines indicate an adjusted p-value cutoff of 0.05. IL17A through IL17F are circled and labeled with gene symbols. D Boxplots showing changes in Th17 cell abundance across disease stages. ** p < 0.01; * p < 0.05; † p < 0.1
Fig. 4
Hub genes identification and validation. A PPI network of important DEGs selected by Cytohubba. The nodes with white and blue rings denote progressively up- and downregulated genes with the development of cervical lesions (Spearman, p < 0.05). Edge thickness is proportional to the interaction score. B Boxplots showing the correlations between hub gene expression and severity of cervical lesion in GSE138080, with Spearman’s rho and p-values presented in the upper left corner. C Scatter plots showing the correlations between hub gene expression (Z score-transformed log2 (FPKM-UQ+1) values) and CNAs in TCGA-CESC dataset, with Spearman’s rho and p-values presented in the lower right corner. Adenocarcinoma (AC) samples are shown in black and squamous cell carcinoma (SCC) samples are shown in blue. D Boxplots showing hub gene expression in HPV-neg, HPV-S (HPV16 persistent infection without progression), and HPV-P (HPV16 persistent infection with progression) women from GSE75132. Statistical comparisons were performed using Wilcoxon rank-sum test. * 0.01 < p < 0.05; † p < 0.1; ns, p ≥ 0.1
Fig. 5
p16INK4a, Ki-67, AURKA, TOP2A, RFC4, and CEP55 immunohistochemistry and potential diagnostic utility. A Heatmap illustrating the hierarchical clustering of samples (columns) from discovery datasets based on the scaled expression of hub genes (rows). Blue to red spectrum color gradient indicates low to high expression level. B Representative IHC staining images of tested markers in normal, CIN1-3, and SCC tissues. Original magnification ×400. Insets, original magnification ×100. C Stacked bar plots showing the fraction of positively (orange) and negatively staining (blue) samples in each disease stage, with positive rates presented. The p-values indicate the difference in the distribution of positive and negative samples between normal/LSIL and HSIL (chi-square test). The numbers above each bar refer to the number of samples in each stage. D Heatmap of kappa statistics for tested IHC markers in HSIL (blue) and all stages (red), related to Additional file 1: Fig. S13. E ROC curves comparing single and combined biomarkers in HSIL diagnosis, with associated AUC values (see also Table 1). ROC, receiver operating characteristic; AUC, area under the ROC curve
Fig. 6
Univariate and multivariate survival analysis for OS in SCC patients, related to Additional file 1: Fig. S9. Five-year Kaplan-Meier curves for OS in SCC patients stratified by the hub gene expression (mRNA and protein) from A TCGA, B TJH, and C Extended cohorts. The numbers of cases and events are shown in the plots. The p-values were calculated with the log-rank test. The optimal cutoff values for HSCORE or staining intensity determined by the surv_cutpoint function from the survminer package were 130 for AURKA, 1 (staining intensity) for CEP55, 45 for TOP2A, and 100/105 (TJH/Extended) for RFC4. D Representative IHC staining images of high and low AURKA, TOP2A, RFC4, and CEP55 expression in SCC. Original magnification ×200. Insets, original magnification ×100. E Forest plot of multivariate Cox regression with clinical features and RFC4 expression taken into account in three cohorts. The main effects are shown as hazard ratios with 95% confidence intervals. F Time-dependent AUC for combined RFC4 expression and clinical covariate model (red) and clinical covariate-only model (blue). The significant difference in the AUC was estimated at 1, 2, 3, 4, and 5 years, and adjusted p-values were calculated. HR, hazard ratio; CI, confidence interval. * 0.01 < p < 0.05; ns, p ≥ 0.1
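
The volcano-plot thresholds described in the Fig. 3C caption (a 2-fold change cutoff in either direction and an adjusted p-value cutoff of 0.05) can be sketched as a simple classification rule. This is an illustrative sketch only, not the authors' pipeline: the function name `classify_gene`, the choice to express fold change as a ratio, and the treatment of values exactly at a cutoff (fold change at the cutoff counts as changed, adjusted p at the cutoff counts as not significant) are assumptions.

```python
import math


def classify_gene(
    fold_change: float,
    adj_p: float,
    fc_cutoff: float = 2.0,
    p_cutoff: float = 0.05,
) -> str:
    """Label a gene as 'up', 'down', or 'ns' (not significant).

    fold_change is the expression ratio (case / control), so 2.0 means
    doubled and 0.5 means halved. Boundary handling is an assumption.
    """
    if fold_change <= 0:
        raise ValueError("fold_change must be a positive ratio")

    # Work on the log2 scale so up- and downregulation are symmetric.
    log2_fc = math.log2(fold_change)
    log2_cutoff = math.log2(fc_cutoff)

    # A gene must pass both the significance and the effect-size filter.
    if adj_p >= p_cutoff or abs(log2_fc) < log2_cutoff:
        return "ns"
    return "up" if log2_fc > 0 else "down"


# Hypothetical values (not from GSE63514), for illustration only.
examples = {
    "geneA": (4.0, 0.01),    # strongly increased, significant
    "geneB": (0.25, 0.001),  # strongly decreased, significant
    "geneC": (1.5, 0.001),   # significant but below the 2-fold cutoff
    "geneD": (8.0, 0.20),    # large change but not significant
}
labels = {name: classify_gene(fc, p) for name, (fc, p) in examples.items()}
print(labels)
```

Working on the log2 scale makes a halving and a doubling equally distant from no change, which is why volcano plots draw the two vertical cutoff lines symmetrically about zero.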
