Subclass mapping: identifying common subtypes in independent disease data sets
- PMID: 18030330
- PMCID: PMC2065909
- DOI: 10.1371/journal.pone.0001195
Abstract
Whole genome expression profiles are widely used to discover molecular subtypes of diseases. A remaining challenge is to identify the correspondence or commonality of subtypes found in multiple, independent data sets generated on various platforms. While model-based supervised learning is often used to make these connections, the models can be biased to the training data set and thus miss inherent, relevant substructure in the test data. Here we describe an unsupervised subclass mapping method (SubMap), which reveals common subtypes between independent data sets. The subtypes within a data set can be determined by unsupervised clustering or given by predetermined phenotypes before applying SubMap. We define a measure of correspondence for subtypes and evaluate its significance building on our previous work on gene set enrichment analysis. The strength of the SubMap method is that it does not impose the structure of one data set upon another, but rather uses a bi-directional approach to highlight the common substructures in both. We show how this method can reveal the correspondence between several cancer-related data sets. Notably, it identifies common subtypes of breast cancer associated with estrogen receptor status, and a subgroup of lymphoma patients who share similar survival patterns, thus improving the accuracy of a clinical outcome predictor.
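The bi-directional idea described above can be illustrated with a minimal sketch. It is not the published SubMap implementation, and every function name in it is hypothetical. It rests on several simplifying assumptions: marker genes are picked by a plain difference of class means, the enrichment statistic is an unweighted Kolmogorov-Smirnov-style running sum in the spirit of gene set enrichment analysis, significance comes from permuting sample labels, and the two directional p-values are merged with Fisher's combined probability method. Both expression matrices are assumed to share the same gene rows.

```python
import numpy as np

def class_vs_rest_ranking(X, labels, cls):
    """Gene indices ordered from most to least up-regulated in `cls` (mean difference)."""
    in_cls = labels == cls
    diff = X[:, in_cls].mean(axis=1) - X[:, ~in_cls].mean(axis=1)
    return np.argsort(diff)[::-1]

def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum statistic: maximum positive deviation along the ranking."""
    hit = np.isin(ranked_genes, gene_set)
    n_hit = hit.sum()
    n_miss = hit.size - n_hit
    step = np.where(hit, 1.0 / n_hit, -1.0 / n_miss)
    return np.cumsum(step).max()

def one_way_pvalue(X_src, lab_src, cls_src, X_dst, lab_dst, cls_dst, n_top, n_perm, rng):
    """Permutation p-value that markers of cls_src are enriched for cls_dst."""
    # Markers come only from the source data set; the target labels are never used here.
    markers = class_vs_rest_ranking(X_src, lab_src, cls_src)[:n_top]

    def score(labels):
        return enrichment_score(class_vs_rest_ranking(X_dst, labels, cls_dst), markers)

    observed = score(lab_dst)
    # Null distribution: shuffle target sample labels (class sizes are preserved).
    null = np.array([score(rng.permutation(lab_dst)) for _ in range(n_perm)])
    return (1 + np.sum(null >= observed)) / (1 + n_perm)

def fisher_combine(p1, p2):
    """Fisher's method for two p-values; chi-square with 4 df has a closed-form tail."""
    t = -2.0 * (np.log(p1) + np.log(p2))
    return np.exp(-t / 2.0) * (1.0 + t / 2.0)

def submap_sketch(X_a, lab_a, X_b, lab_b, n_top=20, n_perm=200, seed=0):
    """Matrix of combined p-values, one entry per (subtype in A, subtype in B) pair."""
    rng = np.random.default_rng(seed)
    cls_a, cls_b = np.unique(lab_a), np.unique(lab_b)
    P = np.ones((cls_a.size, cls_b.size))
    for i, a in enumerate(cls_a):
        for j, b in enumerate(cls_b):
            # Direction 1: markers learned in A, tested in B.
            p_ab = one_way_pvalue(X_a, lab_a, a, X_b, lab_b, b, n_top, n_perm, rng)
            # Direction 2: markers learned in B, tested in A.
            p_ba = one_way_pvalue(X_b, lab_b, b, X_a, lab_a, a, n_top, n_perm, rng)
            P[i, j] = fisher_combine(p_ab, p_ba)
    return cls_a, cls_b, P
```

Calling `submap_sketch(X_a, lab_a, X_b, lab_b)` with two genes-by-samples arrays and their subtype label vectors returns the subtype labels of each data set and a matrix in which small entries suggest corresponding subtypes. Because each direction derives its markers from one data set and tests them in the other, neither data set's structure is imposed on the other. The sketch applies no multiple-testing correction across subtype pairs, which a real analysis would need.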
