Single sample scoring of molecular phenotypes
- PMID: 30400809
- PMCID: PMC6219008
- DOI: 10.1186/s12859-018-2435-4
Abstract
Background: Gene set scoring provides a useful approach for quantifying concordance between sample transcriptomes and selected molecular signatures. Most methods use information from all samples to score an individual sample, which leads to unstable scores in small data sets and introduces biases from sample composition (e.g., varying numbers of samples for different cancer subtypes). To address these issues, we have developed a truly single-sample scoring method and an associated R/Bioconductor package, singscore (https://bioconductor.org/packages/singscore).
Results: We use multiple cancer data sets to compare singscore against widely used methods, including GSVA, z-score, PLAGE, and ssGSEA. Our approach does not depend upon background samples, so its scores are stable regardless of the composition and number of samples being scored. In contrast, scores obtained by GSVA, z-score, PLAGE, and ssGSEA can be unstable when fewer samples are available (NS < 25). The singscore method performs as well as the best-performing methods in terms of power, recall, false positive rate, and computational time, and provides consistently high and balanced performance across all these criteria. To enhance the impact and utility of our method, we have also included a set of functions implementing visual analysis and diagnostics to support the exploration of molecular phenotypes in single samples and across populations of data.
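The independence from background samples can be illustrated with a minimal sketch. It assumes a simplified rank-based score (the mean within-sample rank of the signature genes, scaled by the number of genes and centred at zero) and compares it with a simplified per-gene z-score that standardises each gene across a cohort. This is not the singscore package, which is written in R; the function names, gene labels, expression values, and normalisation below are all hypothetical and chosen only for illustration.

```python
from statistics import mean, pstdev


def rank_score(sample, gene_set):
    """Score one sample using only its own expression values.

    Genes are ranked in ascending order of expression within the sample;
    the score is the mean rank of the gene-set members divided by the
    number of genes, shifted so that 0 is the midpoint.
    Simplification: ties are broken by sort order, not averaged.
    """
    ordered = sorted(sample, key=lambda g: sample[g])
    rank = {gene: i + 1 for i, gene in enumerate(ordered)}
    members = [g for g in gene_set if g in rank]
    if not members:
        raise ValueError("no gene-set members found in sample")
    return mean(rank[g] for g in members) / len(rank) - 0.5


def cohort_z_score(cohort, index, gene_set):
    """Cohort-dependent comparison: mean per-gene z-score of one sample.

    Each gene is standardised across all samples in `cohort`, so the
    result changes whenever the cohort changes.
    """
    zs = []
    for g in gene_set:
        values = [s[g] for s in cohort]
        sd = pstdev(values)
        zs.append((cohort[index][g] - mean(values)) / sd if sd > 0 else 0.0)
    return mean(zs)


# Hypothetical expression values for three samples over five genes.
sample_a = {"G1": 9.0, "G2": 7.5, "G3": 2.0, "G4": 1.0, "G5": 5.0}
sample_b = {"G1": 3.0, "G2": 2.5, "G3": 8.0, "G4": 6.0, "G5": 5.5}
sample_c = {"G1": 8.5, "G2": 8.0, "G3": 1.5, "G4": 2.5, "G5": 4.0}
signature = ["G1", "G2"]

# The rank-based score needs only sample_a, so no cohort can alter it.
alone = rank_score(sample_a, signature)

# The z-score of the same sample shifts when a third sample is added.
z_with_b = cohort_z_score([sample_a, sample_b], 0, signature)
z_with_bc = cohort_z_score([sample_a, sample_b, sample_c], 0, signature)
```

In this sketch `rank_score` takes a single sample as input, so its value for `sample_a` cannot depend on which other samples are scored alongside it, whereas `cohort_z_score` returns different values for the same sample under the two cohorts.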
Conclusions: The singscore method described here functions independently of sample composition in gene expression data and thus provides stable scores, which are particularly useful for small data sets or data integration. Singscore performs well across all performance criteria and includes a suite of powerful visualization functions to assist in the interpretation of results. The method performs as well as or better than other scoring approaches in terms of its power to distinguish samples with distinct biology and its ability to call truly differential gene sets between two conditions. These scores can be used for dimensional reduction of transcriptomic data, and the phenotypic landscapes obtained by scoring samples against multiple molecular signatures may provide insights for sample stratification.
Keywords: Dimensional reduction; Gene set enrichment; Gene set score; Gene signature; Molecular features; Molecular phenotypes; Personalised medicine; Single sample; Singscore; Transcriptome.
Conflict of interest statement
Ethics approval and consent to participate
NA
Consent for publication
NA
Competing interests
The authors declare that they have no competing interests.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
