iSMNN: batch effect correction for single-cell RNA-seq data via iterative supervised mutual nearest neighbor refinement

Brief Bioinform. 2021 Sep 2;22(5):bbab122. doi: 10.1093/bib/bbab122.


Batch effect correction is an essential step in the integrative analysis of multiple single-cell RNA-sequencing (scRNA-seq) datasets. One state-of-the-art strategy for batch effect correction is the unsupervised or supervised detection of mutual nearest neighbors (MNNs). However, both types of methods detect MNNs only across batches of uncorrected data, where large batch effects may distort the MNN search. To address this issue, we present iSMNN, a batch effect correction approach that iteratively refines supervised MNNs using data after correction. Our benchmarking on both simulated and real datasets shows that iterative refinement of MNNs improves correction performance. Compared to popular alternative methods, iSMNN better mixes cells of the same cell type across batches. In addition, iSMNN can facilitate the identification of differentially expressed genes (DEGs) that are relevant to the biological function of certain cell types. These results indicate that iSMNN will be a valuable method for integrating multiple scRNA-seq datasets and can facilitate biological and medical studies at the single-cell level.
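To make the idea of MNN detection and its refinement on corrected data concrete, the following is a minimal sketch in Python with NumPy. It is not the iSMNN implementation. The function names, the treatment of "supervised" as restricting pairs to cells that share a label, and the single global mean shift used as the correction step are all assumptions made for illustration.

```python
import numpy as np


def mutual_nearest_neighbors(batch_a, batch_b, k=1, labels_a=None, labels_b=None):
    """Return sorted (i, j) pairs where cell i of batch_a and cell j of batch_b
    are each among the other's k nearest neighbors across batches.

    If labels are given, pairs are restricted to cells with the same label
    (an assumed, simplified reading of "supervised" MNN detection).
    """
    # Pairwise Euclidean distances, shape (n_a, n_b).
    dists = np.linalg.norm(batch_a[:, None, :] - batch_b[None, :, :], axis=2)
    if labels_a is not None and labels_b is not None:
        mismatch = np.asarray(labels_a)[:, None] != np.asarray(labels_b)[None, :]
        dists[mismatch] = np.inf  # forbid pairs across different labels
    # k nearest cells in B for each cell in A, and vice versa.
    nn_a_to_b = np.argsort(dists, axis=1)[:, :k]
    nn_b_to_a = np.argsort(dists.T, axis=1)[:, :k]
    pairs = []
    for i in range(batch_a.shape[0]):
        for j in nn_a_to_b[i]:
            if np.isfinite(dists[i, j]) and i in nn_b_to_a[j]:
                pairs.append((i, int(j)))
    return sorted(pairs)


def iterative_refine(batch_a, batch_b, k=1, labels_a=None, labels_b=None, n_iter=3):
    """Toy refinement loop: re-detect MNNs on the corrected data each round.

    The correction here is one global mean shift of batch_b toward batch_a,
    a deliberately crude stand-in chosen only for illustration.
    """
    corrected = np.asarray(batch_b, dtype=float).copy()
    for _ in range(n_iter):
        pairs = mutual_nearest_neighbors(batch_a, corrected, k, labels_a, labels_b)
        if not pairs:
            break
        idx_a, idx_b = zip(*pairs)
        shift = (batch_a[list(idx_a)] - corrected[list(idx_b)]).mean(axis=0)
        corrected += shift
    return corrected


# Hypothetical toy data: batch_b is batch_a displaced by a constant offset.
batch_a = np.array([[0.0, 0.0], [10.0, 10.0]])
batch_b = batch_a + 1.0
pairs = mutual_nearest_neighbors(batch_a, batch_b, k=1)
corrected = iterative_refine(batch_a, batch_b, k=1)
```

The point of the loop is that each round searches for MNNs on the corrected coordinates rather than the original ones, so the pairing is not fixed by the initial, batch-affected distances.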

Keywords: batch effect correction; iterative refinement; mutual nearest neighbor; single-cell RNA-seq.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Animals
  • Benchmarking / methods
  • Cells, Cultured
  • Computational Biology / methods*
  • Gene Expression Profiling / methods*
  • Humans
  • Mice
  • Reproducibility of Results
  • Sequence Analysis, RNA / methods*
  • Single-Cell Analysis / methods*