A comparison of deep learning-based pre-processing and clustering approaches for single-cell RNA sequencing data

Brief Bioinform. 2022 Jan 17;23(1):bbab345. doi: 10.1093/bib/bbab345.

Abstract

The emergence of single-cell RNA sequencing has facilitated the study of genomes, transcriptomes and proteomes. As single-cell RNA-seq datasets are released continuously, a major challenge for traditional RNA analysis tools is that these data are high-dimensional, highly sparse, noisy and large in scale. Deep learning technologies are well suited to these characteristics and offer considerable promise. Here, we give a systematic review of the most popular deep learning-based methods and tools for single-cell RNA-seq analysis, covering data preprocessing (quality control, normalization, data correction, dimensionality reduction and data visualization) and the clustering task for downstream analysis. We further evaluate the deep model-based methods for data correction and clustering quantitatively on 11 gold-standard datasets. Moreover, we discuss the data preferences and limitations of these methods, and offer guidance to help users select appropriate methods and tools.
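
As a rough illustration of the preprocessing stages named above, the following minimal sketch runs quality control, normalization and dimensionality reduction on a toy count matrix. It is not taken from the reviewed methods: the function name `preprocess`, the parameters `min_genes`, `target_sum` and `n_components`, and the toy data are all hypothetical, and plain PCA (via SVD) stands in for the deep models that the review actually compares.

```python
import numpy as np

def preprocess(counts, min_genes=2, target_sum=1e4, n_components=2):
    """Toy preprocessing for a cells x genes matrix of raw counts.

    All thresholds here are illustrative assumptions, not recommended values.
    """
    counts = np.asarray(counts, dtype=float)

    # Quality control: keep cells with at least `min_genes` detected genes.
    detected = (counts > 0).sum(axis=1)
    keep = detected >= min_genes
    kept = counts[keep]

    # Normalization: scale each cell to the same total count, then log1p.
    totals = kept.sum(axis=1, keepdims=True)
    normed = np.log1p(kept / totals * target_sum)

    # Dimensionality reduction: PCA via SVD of the gene-centered matrix.
    centered = normed - normed.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    embedding = u[:, :n_components] * s[:n_components]
    return embedding, keep

# Hypothetical toy data: 5 cells x 6 genes; the first cell is empty.
toy = np.array([
    [0, 0, 0, 0, 0, 0],
    [5, 0, 3, 0, 1, 0],
    [4, 1, 2, 0, 0, 0],
    [0, 0, 0, 6, 2, 3],
    [0, 1, 0, 5, 3, 2],
])
embedding, keep = preprocess(toy)
```

The resulting low-dimensional `embedding` is the kind of input a downstream clustering step would consume; the deep learning methods covered by the review replace one or more of these hand-written stages with learned models.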

Keywords: clustering analysis; deep learning; pre-processing steps; single-cell RNA-seq.

Publication types

  • Research Support, Non-U.S. Gov't
  • Systematic Review

MeSH terms

  • Cluster Analysis
  • Deep Learning*
  • Gene Expression Profiling / methods
  • Sequence Analysis, RNA / methods
  • Single-Cell Analysis* / methods