sc-REnF: An entropy guided robust feature selection for single-cell RNA-seq data

Brief Bioinform. 2022 Mar 10;23(2):bbab517. doi: 10.1093/bib/bbab517.

Abstract

Annotation of cells in single-cell clustering requires a homogeneous grouping of cell populations. Since single-cell data are susceptible to technical noise, the quality of the genes selected prior to clustering is of crucial importance in the preliminary steps of downstream analysis. Robust gene selection has therefore gained considerable attention in recent years. We introduce sc-REnF [robust entropy based feature (gene) selection method], which aims to leverage the advantages of Rényi and Tsallis entropies in gene selection for single-cell clustering. Experiments demonstrate that, with a tuned parameter (q), Rényi and Tsallis entropies select genes that significantly improve clustering results over competing methods. sc-REnF captures relevancy and redundancy among the features of noisy data extremely well owing to its robust objective function. Moreover, the selected features/genes are able to determine unknown cells with high accuracy. Finally, sc-REnF yields good clustering performance on small-sample, large-feature scRNA-seq data. Availability: sc-REnF is available at https://github.com/Snehalikalall/sc-REnF.
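
For readers unfamiliar with the two entropies named above, the following is a minimal sketch of their standard textbook definitions for a discrete probability distribution, with q as the tunable order parameter. It is not the sc-REnF objective function, which the abstract does not spell out; the function names renyi_entropy and tsallis_entropy are hypothetical and chosen only for illustration. Both measures reduce to Shannon entropy in the limit q → 1.

```python
import math

def renyi_entropy(p, q):
    """Renyi entropy of order q (natural log) for a discrete distribution p."""
    p = [x for x in p if x > 0]  # zero-probability outcomes contribute nothing
    if math.isclose(q, 1.0):
        # Limit q -> 1 is the Shannon entropy
        return -sum(x * math.log(x) for x in p)
    return math.log(sum(x ** q for x in p)) / (1.0 - q)

def tsallis_entropy(p, q):
    """Tsallis entropy of order q for a discrete distribution p."""
    p = [x for x in p if x > 0]
    if math.isclose(q, 1.0):
        # Limit q -> 1 is again the Shannon entropy
        return -sum(x * math.log(x) for x in p)
    return (1.0 - sum(x ** q for x in p)) / (q - 1.0)

if __name__ == "__main__":
    uniform = [0.25, 0.25, 0.25, 0.25]
    print(renyi_entropy(uniform, 2.0))    # equals log(4) for a uniform distribution
    print(tsallis_entropy(uniform, 2.0))  # 1 - 4 * 0.25**2 = 0.75
```

For a uniform distribution over four outcomes, the Rényi entropy equals log 4 regardless of q, whereas the Tsallis entropy depends on q (0.75 at q = 2), which illustrates how the parameter changes the weight given to rare versus frequent values.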

Keywords: Renyi and Tsallis entropy; clustering; gene selection; single-cell data.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Cluster Analysis
  • Entropy
  • Gene Expression Profiling* / methods
  • RNA-Seq
  • Sequence Analysis, RNA / methods
  • Single-Cell Analysis* / methods
  • Whole Exome Sequencing