HGC: fast hierarchical clustering for large-scale single-cell data
- PMID: 34096998
- DOI: 10.1093/bioinformatics/btab420
Abstract
Summary: Clustering is a key step in revealing heterogeneities in single-cell data. Most existing single-cell clustering methods output a fixed number of clusters without the hierarchical information. Classical hierarchical clustering (HC) provides dendrograms of cells, but cannot scale to large datasets due to high computational complexity. We present HGC, a fast Hierarchical Graph-based Clustering tool to address both problems. It combines the advantages of graph-based clustering and HC. On the shared nearest-neighbor graph of cells, HGC constructs the hierarchical tree with linear time complexity. Experiments showed that HGC enables multiresolution exploration of the biological hierarchy underlying the data, achieves state-of-the-art accuracy on benchmark data and can scale to large datasets.
Availability and implementation: The R package of HGC is available at https://bioconductor.org/packages/HGC/.
Supplementary information: Supplementary data are available at Bioinformatics online.
© The Author(s) 2021. Published by Oxford University Press. All rights reserved.
