Normalization and batch correction are critical steps in processing single-cell RNA sequencing (scRNA-seq) data: they remove technical effects and systematic biases to unmask the biological signals of interest. Although a number of computational methods have been developed, there is no guidance for choosing appropriate procedures in different scenarios. In this study, we assessed the performance of 28 scRNA-seq noise reduction procedures in 55 scenarios using simulated and real datasets. The scenarios accounted for multiple biological and technical factors that greatly affect denoising performance, including the relative magnitude of batch effects, the extent of cell population imbalance, the complexity of cell group structures, the proportion and similarity of nonoverlapping cell populations, dropout rates and variable library sizes. We used multiple quantitative metrics and visualization of low-dimensional cell embeddings to evaluate how well each procedure mixed batches while preserving the original cell group and gene structures. Based on our results, we identified the technical and biological factors affecting the performance of each method and recommended suitable methods for different scenarios. In addition, we highlighted one challenging scenario in which most methods failed and overcorrected the data. Our study not only provides a comprehensive guideline for selecting suitable noise reduction procedures but also points out unsolved issues in the field, especially the urgent need to develop metrics for assessing batch correction with respect to imperceptible cell-type mixing.
Keywords: batch effect adjustment; bioinformatics; normalization; single-cell RNA sequencing.
© The Author(s) 2022. Published by Oxford University Press.