Spectrum: fast density-aware spectral clustering for single and multi-omic data
- PMID: 31501851
- PMCID: PMC7703791
- DOI: 10.1093/bioinformatics/btz704
Abstract
Motivation: Clustering patient omic data is integral to developing precision medicine because it allows the identification of disease subtypes. A current major challenge is the integration of multi-omic data to identify a shared structure and reduce noise. Cluster analysis is also increasingly applied to single-omic data, for example, in single-cell RNA-seq analysis for clustering the transcriptomes of individual cells. This technology has clinical implications. Our motivation was therefore to develop a flexible and effective spectral clustering tool for both single and multi-omic data.
Results: We present Spectrum, a new spectral clustering method for complex omic data. Spectrum uses a self-tuning density-aware kernel we developed that enhances the similarity between points that share common nearest neighbours. It uses a tensor product graph data integration and diffusion procedure to reduce noise and reveal underlying structures. Spectrum contains a new method for finding the optimal number of clusters (K) involving eigenvector distribution analysis. Spectrum can automatically find K for both Gaussian and non-Gaussian structures. We demonstrate across 21 real expression datasets that Spectrum gives improved runtimes and better clustering results relative to other methods.
Availability and implementation: Spectrum is available as an R software package from CRAN https://cran.r-project.org/web/packages/Spectrum/index.html.
Supplementary information: Supplementary data are available at Bioinformatics online.
© The Author(s) 2019. Published by Oxford University Press.
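The kernel and cluster-number ideas in the Results can be illustrated with a small sketch. It is not the Spectrum implementation: the kernel form (local scaling by the distance to the k-th neighbour, with the bandwidth widened by the count of common nearest neighbours), the function names, and the parameters `k_scale`, `k_cnn` and `max_k` are assumptions chosen for illustration. The number of clusters is picked with the standard eigengap heuristic, which stands in for the eigenvector distribution analysis described in the abstract. The tensor product graph diffusion step is omitted.

```python
import numpy as np


def density_aware_kernel(X, k_scale=7, k_cnn=7):
    """Affinity with local scaling and a shared-neighbour boost (illustrative form)."""
    n = X.shape[0]
    # Pairwise Euclidean distances between samples (rows of X).
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    # Neighbour ranking per sample; column 0 is the sample itself (assumes no duplicates).
    order = np.argsort(dist, axis=1)
    # Local scale: distance to the k_scale-th nearest neighbour.
    sigma = dist[np.arange(n), order[:, k_scale]] + 1e-12
    # member[i, j] = 1 if j is among the k_cnn nearest neighbours of i.
    member = np.zeros((n, n))
    rows = np.repeat(np.arange(n), k_cnn)
    member[rows, order[:, 1:k_cnn + 1].ravel()] = 1.0
    # cnn[i, j] = number of nearest neighbours shared by i and j.
    cnn = member @ member.T
    # More shared neighbours -> wider effective bandwidth -> higher similarity.
    affinity = np.exp(-dist ** 2 / (np.outer(sigma, sigma) * (cnn + 1.0)))
    np.fill_diagonal(affinity, 0.0)
    return affinity


def estimate_k_eigengap(affinity, max_k=8):
    """Choose K at the largest gap among the leading eigenvalues (eigengap heuristic)."""
    degree = affinity.sum(axis=1)
    inv_sqrt = 1.0 / np.sqrt(degree)
    # Symmetrically normalised affinity D^(-1/2) A D^(-1/2).
    normalized = affinity * inv_sqrt[:, None] * inv_sqrt[None, :]
    evals = np.linalg.eigvalsh(normalized)[::-1]  # descending order
    gaps = evals[:max_k] - evals[1:max_k + 1]
    return int(np.argmax(gaps)) + 1, evals


# Synthetic example: three well-separated Gaussian blobs in two dimensions.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
X = np.vstack([c + 0.3 * rng.standard_normal((50, 2)) for c in centers])

A = density_aware_kernel(X)
k_hat, evals = estimate_k_eigengap(A)
print("estimated K:", k_hat)
```

On data like this the affinity matrix is close to block-diagonal, so the normalised matrix has one eigenvalue near 1 per block and the largest gap is expected after the third eigenvalue. Real omic data are rarely this clean, which is the case the diffusion step and the non-Gaussian K-selection in Spectrum are designed for.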
