Batch effects in single-cell RNA-sequencing data are corrected by matching mutual nearest neighbors
- PMID: 29608177
- PMCID: PMC6152897
- DOI: 10.1038/nbt.4091
Abstract
Large-scale single-cell RNA sequencing (scRNA-seq) data sets that are produced in different laboratories and at different times contain batch effects that may compromise the integration and interpretation of the data. Existing scRNA-seq analysis methods incorrectly assume that the composition of cell populations is either known or identical across batches. We present a strategy for batch correction based on the detection of mutual nearest neighbors (MNNs) in the high-dimensional expression space. Our approach does not rely on predefined or equal population compositions across batches; instead, it requires only that a subset of the population be shared between batches. We demonstrate the superiority of our approach compared with existing methods by using both simulated and real scRNA-seq data sets. Using multiple droplet-based scRNA-seq data sets, we demonstrate that our MNN batch-effect-correction method can be scaled to large numbers of cells.
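The core idea of mutual nearest neighbors can be illustrated with a minimal sketch. A cell in one batch and a cell in the other form an MNN pair when each lies among the other's k nearest neighbors across batches. Everything below is an assumption made for illustration and is not the authors' implementation: the function names `knn_indices` and `find_mnn_pairs`, the brute-force Euclidean search, the choice `k=2`, and the toy two-gene data are all hypothetical. The final averaged offset is a deliberately crude stand-in for a batch-effect estimate, not the correction procedure of the paper.

```python
import numpy as np

def knn_indices(query, ref, k):
    """For each row of `query`, return indices of its k nearest rows in `ref`
    (brute-force squared Euclidean distance; fine for small toy data)."""
    d2 = ((query[:, None, :] - ref[None, :, :]) ** 2).sum(axis=2)
    return np.argsort(d2, axis=1)[:, :k]

def find_mnn_pairs(batch1, batch2, k=2):
    """Return (i, j) pairs where cell i of batch1 and cell j of batch2
    are each within the other's k nearest neighbors across batches."""
    nn12 = knn_indices(batch1, batch2, k)  # neighbors in batch2 of each batch1 cell
    nn21 = knn_indices(batch2, batch1, k)  # neighbors in batch1 of each batch2 cell
    pairs = []
    for i in range(batch1.shape[0]):
        for j in nn12[i]:
            if i in nn21[j]:  # the relationship must hold in both directions
                pairs.append((i, int(j)))
    return pairs

# Toy data (hypothetical): cells 0-1 of each batch come from a shared
# population; cells 2-3 come from populations unique to each batch.
# The shared population in batch2 is shifted by (+1, +1) as a mock batch effect.
batch1 = np.array([[0.0, 0.0], [0.1, 0.0], [10.0, 0.0], [10.1, 0.0]])
batch2 = np.array([[1.0, 1.0], [1.1, 1.0], [0.0, 10.0], [0.1, 10.0]])

pairs = find_mnn_pairs(batch1, batch2, k=2)
print(sorted(pairs))

# Crude illustration only: average the pairwise differences to get a single
# offset that would move batch2 toward batch1.
offset = np.mean([batch1[i] - batch2[j] for i, j in pairs], axis=0)
print(offset)
```

In this toy layout only cells from the shared population are paired; the batch-specific cells find no mutual partner, which mirrors the abstract's point that only a subset of the population needs to be shared between batches.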
Conflict of interest statement
The authors declare no competing financial interests.
