Searching for a signature involving 10 genes to predict the survival of patients with acute myelocytic leukemia through a combined multi-omics analysis

PeerJ. 2020 Jun 25:8:e9437. doi: 10.7717/peerj.9437. eCollection 2020.

Abstract

Background: Acute myelocytic leukemia (AML) still has a poor prognosis. Gene markers that predict AML prognosis therefore need to be identified through systematic analysis of multi-omics data.

Methods: Copy number variation (CNV), mutation, RNA-Seq, and single nucleotide polymorphism (SNP) data, together with clinical follow-up data, were obtained from The Cancer Genome Atlas (TCGA) database. All samples (n = 229) were then randomly divided into a training set and a test set. The training set was used to screen for genes related to prognosis and for genes with mutations, SNPs, or CNVs. Shrinkage estimation was then applied to the screened genes for feature selection, to identify stable biomarkers. Finally, a prognostic model based on the selected genes was established and validated in the test set (n = 127) and in the GEO verification sets (n = 124 and n = 72). The model was also compared with the AML prognosis prediction model reported in the literature.

Results: In total, 832 genes related to prognosis, 23 related to copy number amplification, 774 related to copy number deletion, and 189 with significant genomic variations were identified. Genes with genomic variations were then integrated with the prognosis-related genes to obtain 38 candidate genes, from which shrinkage estimation selected 10 feature genes (FAT2, CAMK2A, TCERG1, GDF9, PTGIS, DOC2B, DNTTIP1, PREX1, CRISPLD1, and C22orf42). A signature was built from these 10 genes using Cox regression analysis, and it served as an independent predictor of AML prognosis. It also stratified samples by risk in the training, test, and external verification sets (P < 0.01). In terms of prediction performance, the model compared favorably with the prognosis prediction model reported in the literature.
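
As an illustration of how a Cox-based gene signature is typically turned into a risk stratification, the following minimal Python sketch computes a per-sample risk score as the weighted sum of the 10 genes' expression values and splits samples at the median score. The coefficient values, the median cutoff, and the function names are all assumptions made for illustration; the abstract reports neither the fitted coefficients nor the cutoff rule.

```python
from statistics import median

# HYPOTHETICAL coefficients: placeholder values only. The abstract does not
# report the fitted Cox regression coefficients for the 10 signature genes.
HYPOTHETICAL_COEFS = {
    "FAT2": 0.10, "CAMK2A": 0.10, "TCERG1": -0.10, "GDF9": 0.10,
    "PTGIS": 0.10, "DOC2B": 0.10, "DNTTIP1": -0.10, "PREX1": 0.10,
    "CRISPLD1": 0.10, "C22orf42": 0.10,
}


def risk_score(expression, coefs):
    """Linear predictor: sum of coefficient * expression over the signature genes."""
    return sum(coefs[gene] * expression[gene] for gene in coefs)


def stratify(samples, coefs):
    """Score every sample and label it 'high' or 'low' risk.

    `samples` maps a sample id to a dict of gene -> expression value.
    The median cutoff is an assumption, not a detail taken from the abstract.
    """
    scores = {sid: risk_score(expr, coefs) for sid, expr in samples.items()}
    cutoff = median(scores.values())
    groups = {sid: ("high" if s > cutoff else "low") for sid, s in scores.items()}
    return scores, cutoff, groups
```

In practice, the two groups would then be compared with a survival test (for example, a log-rank test) in each data set; that step is omitted here.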

Conclusion: A 10-gene signature was established in this study and is a promising new marker for predicting AML prognosis.

Keywords: Acute myelocytic leukemia; Bioinformatics; CNV; Prognosis marker; TCGA.

Grants and funding

This work was supported by the Natural Science Foundation of Zhejiang Province (LY19H290003, LQ20H280002); the Zhejiang Provincial Medical and Health Science and Technology Project (2020KY196, 2018277310); the Foundation of Zhejiang province Chinese medicine science and technology planes (2017ZB030, 2020ZA044); and Key project of the 2017 school research fund of Zhejiang Chinese Medical University (2017ZZ02). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.