Batch effects in single-cell RNA-sequencing data are corrected by matching mutual nearest neighbors
- PMID: 29608177
- PMCID: PMC6152897
- DOI: 10.1038/nbt.4091
Abstract
Large-scale single-cell RNA sequencing (scRNA-seq) data sets that are produced in different laboratories and at different times contain batch effects that may compromise the integration and interpretation of the data. Existing scRNA-seq analysis methods incorrectly assume that the composition of cell populations is either known or identical across batches. We present a strategy for batch correction based on the detection of mutual nearest neighbors (MNNs) in the high-dimensional expression space. Our approach does not rely on predefined or equal population compositions across batches; instead, it requires only that a subset of the population be shared between batches. We demonstrate the superiority of our approach compared with existing methods by using both simulated and real scRNA-seq data sets. Using multiple droplet-based scRNA-seq data sets, we demonstrate that our MNN batch-effect-correction method can be scaled to large numbers of cells.
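The core idea of mutual nearest neighbors can be illustrated with a small sketch. This is not the authors' implementation: it assumes brute-force squared Euclidean distances on already-normalized expression vectors, and the function name `find_mnn_pairs` and the parameter `k` are hypothetical names chosen for illustration. A cell in one batch and a cell in the other form an MNN pair only if each is among the other's `k` nearest neighbors across batches.

```python
import numpy as np

def find_mnn_pairs(batch_a, batch_b, k=2):
    """Return sorted (i, j) pairs where cell i of batch_a and cell j of
    batch_b are each among the other's k nearest neighbors across batches.

    Illustrative only: brute-force distances, no normalization step.
    """
    a = np.asarray(batch_a, dtype=float)
    b = np.asarray(batch_b, dtype=float)
    # Pairwise squared Euclidean distances, shape (n_a, n_b).
    dists = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)
    k_ab = min(k, b.shape[0])
    k_ba = min(k, a.shape[0])
    # For each cell in A, the indices of its k nearest cells in B.
    nn_a_in_b = np.argsort(dists, axis=1)[:, :k_ab]
    # For each cell in B, the indices of its k nearest cells in A.
    nn_b_in_a = np.argsort(dists.T, axis=1)[:, :k_ba]
    a_to_b = {(i, int(j)) for i in range(a.shape[0]) for j in nn_a_in_b[i]}
    b_to_a = {(int(i), j) for j in range(b.shape[0]) for i in nn_b_in_a[j]}
    # Keep only pairs where the neighbor relation holds in both directions.
    return sorted(a_to_b & b_to_a)

# Toy data (made up): batch A holds two populations, batch B only one,
# shifted by a batch effect along the first axis.
batch_a = [[0.0, 0.0], [0.1, 0.0], [10.0, 10.0]]
batch_b = [[1.0, 0.0], [1.1, 0.0]]
pairs = find_mnn_pairs(batch_a, batch_b, k=1)
```

In this toy example the cell at (10, 10), which belongs to a population absent from batch B, never enters an MNN pair, which mirrors the requirement that only a subset of the population be shared between batches.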
Conflict of interest statement
The authors declare no competing financial interests.
Similar articles
- iSMNN: batch effect correction for single-cell RNA-seq data via iterative supervised mutual nearest neighbor refinement. Brief Bioinform. 2021 Sep 2;22(5):bbab122. doi: 10.1093/bib/bbab122. PMID: 33839756. Free PMC article.
- A joint deep learning model enables simultaneous batch effect correction, denoising, and clustering in single-cell transcriptomics. Genome Res. 2021 Oct;31(10):1753-1766. doi: 10.1101/gr.271874.120. Epub 2021 May 25. PMID: 34035047. Free PMC article.
- SMNN: batch effect correction for single-cell RNA-seq data via supervised mutual nearest neighbor detection. Brief Bioinform. 2021 May 20;22(3):bbaa097. doi: 10.1093/bib/bbaa097. PMID: 32591778. Free PMC article.
- Normalization for Single-Cell RNA-Seq Data Analysis. Methods Mol Biol. 2019;1935:11-23. doi: 10.1007/978-1-4939-9057-3_2. PMID: 30758817. Review.
- Supervised application of internal validation measures to benchmark dimensionality reduction methods in scRNA-seq data. Brief Bioinform. 2021 Nov 5;22(6):bbab304. doi: 10.1093/bib/bbab304. PMID: 34374742. Review.
Cited by
- Single-cell analyses reveal the clonal and molecular aetiology of Flt3L-induced emergency dendritic cell development. Nat Cell Biol. 2021 Mar;23(3):219-231. doi: 10.1038/s41556-021-00636-7. Epub 2021 Mar 1. PMID: 33649477.
- HypoMap-a unified single-cell gene expression atlas of the murine hypothalamus. Nat Metab. 2022 Oct;4(10):1402-1419. doi: 10.1038/s42255-022-00657-y. Epub 2022 Oct 20. PMID: 36266547. Free PMC article.
- Integrative computational epigenomics to build data-driven gene regulation hypotheses. Gigascience. 2020 Jun 1;9(6):giaa064. doi: 10.1093/gigascience/giaa064. PMID: 32543653. Free PMC article.
- PARE: A framework for removal of confounding effects from any distance-based dimension reduction method. PLoS Comput Biol. 2024 Jul 10;20(7):e1012241. doi: 10.1371/journal.pcbi.1012241. eCollection 2024 Jul. PMID: 38985831. Free PMC article.
- Integration and transfer learning of single-cell transcriptomes via cFIT. Proc Natl Acad Sci U S A. 2021 Mar 9;118(10):e2024383118. doi: 10.1073/pnas.2024383118. PMID: 33658382. Free PMC article.
