Metacells untangle large and complex single-cell transcriptome networks

BMC Bioinformatics. 2022 Aug 13;23(1):336. doi: 10.1186/s12859-022-04861-1.


Background: Single-cell RNA sequencing (scRNA-seq) technologies offer unique opportunities for exploring heterogeneous cell populations. However, in-depth single-cell transcriptomic characterization of complex tissues often requires profiling tens to hundreds of thousands of cells. Such large numbers of cells pose a major hurdle for downstream analysis, interpretation and visualization.

Results: We develop a framework called SuperCell to merge highly similar cells into metacells and perform standard scRNA-seq data analyses at the metacell level. Our systematic benchmarking demonstrates that metacells not only preserve but often improve the results of downstream analyses including visualization, clustering, differential expression, cell type annotation, gene correlation, imputation, RNA velocity and data integration. By capitalizing on the redundancy inherent to scRNA-seq data, metacells significantly facilitate and accelerate the construction and interpretation of single-cell atlases, as demonstrated by the integration of 1.46 million cells from COVID-19 patients in less than two hours on a standard desktop.
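
To make the idea of merging highly similar cells concrete, the following is a minimal sketch in plain Python, not the SuperCell implementation or its API. The abstract does not specify the merging algorithm, so this sketch substitutes a simple greedy centroid-based agglomeration. The function name `build_metacells`, the parameter `gamma` (taken here to mean the ratio of cells to metacells) and the toy expression values are all assumptions made for illustration.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two expression vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_metacells(cells, gamma):
    """Greedily merge the two closest groups of cells until roughly
    len(cells) / gamma groups remain.

    cells: list of equal-length expression vectors, one per cell.
    gamma: hypothetical graining level (cells per metacell, on average).
    Returns (groups, profiles): member cell indices for each metacell
    and the averaged expression profile of each metacell.
    """
    n_target = max(1, math.ceil(len(cells) / gamma))
    groups = [[i] for i in range(len(cells))]
    profiles = [[float(v) for v in c] for c in cells]

    while len(groups) > n_target:
        # Find the pair of groups whose average profiles are closest.
        best = None
        for i in range(len(groups)):
            for j in range(i + 1, len(groups)):
                d = sq_dist(profiles[i], profiles[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best

        # Merge group j into group i with a size-weighted average profile.
        ni, nj = len(groups[i]), len(groups[j])
        profiles[i] = [
            (ni * x + nj * y) / (ni + nj)
            for x, y in zip(profiles[i], profiles[j])
        ]
        groups[i] = groups[i] + groups[j]
        del groups[j]
        del profiles[j]

    return groups, profiles


# Toy data (made up): six cells, two genes, two obvious populations.
cells = [[10, 0], [9, 1], [0, 10], [1, 9], [10, 1], [0, 9]]
groups, profiles = build_metacells(cells, gamma=3)
```

With `gamma=3`, the six toy cells collapse into two metacells, and downstream steps such as clustering or differential expression would then operate on `profiles` rather than on the individual cells. The pairwise search here is quadratic per merge and only suitable for illustration; it says nothing about how the framework achieves its reported scalability.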

Conclusions: SuperCell is a framework for building and analyzing metacells that preserves the results of scRNA-seq data analyses while making those analyses substantially faster and easier to carry out.

Keywords: Coarse-graining; Computational biology; Metacells; Single-cell transcriptomics.

MeSH terms

  • COVID-19*
  • Cluster Analysis
  • Humans
  • Sequence Analysis, RNA / methods
  • Single-Cell Analysis / methods
  • Transcriptome*