Evaluation of deep learning-based feature selection for single-cell RNA sequencing data analysis

Genome Biol. 2023 Nov 10;24(1):259. doi: 10.1186/s13059-023-03100-x.

Abstract

Background: Feature selection is an essential task in single-cell RNA-seq (scRNA-seq) data analysis and can be critical for gene dimension reduction and downstream analyses, such as gene marker identification and cell type classification. Most popular methods for feature selection from scRNA-seq data are based on the concept of differential distribution, wherein a statistical model is used to detect changes in gene expression among cell types. Recently developed deep learning-based feature selection methods provide an alternative to these traditional differential distribution-based methods: the importance of a gene is determined by a neural network rather than by a statistical test.
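To make the differential distribution idea concrete, the following is a minimal sketch of one simple instance of it: ranking genes by a one-vs-rest Welch t-statistic for a single cell type. The function name `rank_genes_by_welch_t`, the toy data, and the choice of a t-statistic are illustrative assumptions; the abstract does not specify which statistical models the evaluated methods use.

```python
import numpy as np


def rank_genes_by_welch_t(expr, labels, target):
    """Rank genes by |Welch t| comparing one cell type against all others.

    expr   : (n_cells, n_genes) array of expression values
    labels : length n_cells array of cell type labels
    target : the cell type to contrast against the rest
    Hypothetical helper for illustration; not taken from the paper.
    """
    expr = np.asarray(expr, dtype=float)
    labels = np.asarray(labels)
    in_group = expr[labels == target]
    out_group = expr[labels != target]

    # Difference in per-gene mean expression between the two groups.
    mean_diff = in_group.mean(axis=0) - out_group.mean(axis=0)

    # Welch standard error: each group's variance scaled by its own size.
    se_in = in_group.var(axis=0, ddof=1) / in_group.shape[0]
    se_out = out_group.var(axis=0, ddof=1) / out_group.shape[0]
    t_stat = mean_diff / np.sqrt(se_in + se_out + 1e-12)  # epsilon avoids 0/0

    # Gene indices sorted from strongest to weakest evidence of a shift.
    order = np.argsort(-np.abs(t_stat))
    return order, t_stat


# Toy data (assumed): 60 cells, 5 genes, gene 2 is elevated in cell type "A".
rng = np.random.default_rng(0)
expr = rng.poisson(1.0, size=(60, 5)).astype(float)
labels = np.array(["A"] * 30 + ["B"] * 30)
expr[:30, 2] += 5.0

order, t_stat = rank_genes_by_welch_t(expr, labels, "A")
```

Selecting the top-ranked indices in `order` yields a candidate marker set for the target cell type. A deep learning-based method would instead derive the per-gene score from a trained network.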

Results: In this work, we explore the utility of various deep learning-based feature selection methods for scRNA-seq data analysis. We sample from the Tabula Muris and Tabula Sapiens atlases to create scRNA-seq datasets with a range of data properties, and we evaluate traditional and deep learning-based feature selection methods in terms of cell type classification, feature selection reproducibility and diversity, and computational time.
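As an illustration of how feature selection reproducibility could be quantified, the sketch below computes the mean pairwise Jaccard overlap between gene sets selected in repeated runs. The abstract does not state which reproducibility measure the authors used, so the Jaccard index, the function names, and the example gene lists are all assumptions.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two gene collections: |A & B| / |A | B|."""
    a, b = set(a), set(b)
    union = a | b
    # Two empty selections are treated as identical.
    return len(a & b) / len(union) if union else 1.0


def mean_pairwise_jaccard(selections):
    """Average Jaccard index over all pairs of selected gene sets.

    Hypothetical reproducibility score; 1.0 means every run selected
    exactly the same genes, 0.0 means no two runs shared a gene.
    """
    pairs = list(combinations(selections, 2))
    if not pairs:
        raise ValueError("need at least two selections to compare")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Assumed example: genes selected by one method on three subsamples.
runs = [
    ["Cd3e", "Cd19", "Lyz2"],
    ["Cd3e", "Cd19", "Ms4a1"],
    ["Cd3e", "Lyz2", "Ms4a1"],
]
score = mean_pairwise_jaccard(runs)
```

Each pair of runs here shares two of four distinct genes, so the score sits midway between fully unstable and fully reproducible selection.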

Conclusions: Our study provides a reference for future development and application of deep learning-based feature selection methods for single-cell omics data analyses.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Cluster Analysis
  • Data Analysis
  • Deep Learning*
  • Gene Expression Profiling* / methods
  • Reproducibility of Results
  • Sequence Analysis, RNA / methods
  • Single-Cell Analysis / methods