Consensus-based clustering of single cells by reconstructing cell-to-cell dissimilarity

Brief Bioinform. 2021 Sep 21;bbab379. doi: 10.1093/bib/bbab379. Online ahead of print.


The development of single-cell ribonucleic acid (RNA) sequencing (scRNA-seq) technology has led to great opportunities for the identification of heterogeneous cell types in complex tissues. Clustering algorithms are of great importance to effectively identify different cell types. In addition, the definition of the distance between each two cells is a critical step for most clustering algorithms. In this study, we found that different distance measures have considerably different effects on clustering algorithms. Moreover, there is no specific distance measure that is applicable to all datasets. In this study, we introduce a new single-cell clustering method called SD-h, which generates an applicable distance measure for different kinds of datasets by optimally synthesizing commonly used distance measures. Then, hierarchical clustering is performed based on the new distance measure for more accurate cell-type clustering. SD-h was tested on nine frequently used scRNA-seq datasets and it showed great superiority over almost all the compared leading single-cell clustering algorithms.

Keywords: consensus matrix; single-cell RNA-seq; single-cell clustering; synthesis distance matrix.