Data exploration, quality control and testing in single-cell qPCR-based gene expression experiments
- PMID: 23267174
- PMCID: PMC3570210
- DOI: 10.1093/bioinformatics/bts714
Abstract
Motivation: Cell populations are never truly homogeneous; individual cells exist in biochemical states that define functional differences between them. New technology based on microfluidic arrays combined with multiplexed quantitative polymerase chain reactions now enables high-throughput single-cell gene expression measurement, allowing assessment of cellular heterogeneity. However, few analytic tools have been developed specifically for the statistical and analytical challenges of single-cell quantitative polymerase chain reaction data.
Results: We present a statistical framework for the exploration, quality control and analysis of single-cell gene expression data from microfluidic arrays. We assess accuracy and within-sample heterogeneity of single-cell expression and develop quality control criteria to filter unreliable cell measurements. We propose a statistical model accounting for the fact that genes at the single-cell level can be on (and a continuous expression measure is recorded) or dichotomously off (and the recorded expression is zero). Based on this model, we derive a combined likelihood ratio test for differential expression that incorporates both the discrete and continuous components. Using an experiment that examines treatment-specific changes in expression, we show that this combined test is more powerful than either the continuous or dichotomous component in isolation, or a t-test on the zero-inflated data. Although developed for measurements from a specific platform (Fluidigm), these tools are generalizable to other multi-parametric measures over large numbers of events.
Availability: All results presented here were obtained using the SingleCellAssay R package available on GitHub (http://github.com/RGLab/SingleCellAssay).
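The combined likelihood ratio test described in the Results can be illustrated with a minimal Python sketch. It is not the SingleCellAssay implementation; the function names are hypothetical, and it rests on three assumptions: positive values are log-scale expression measurements, zeros mark undetected genes, and the two groups share a common variance, so that the sum of the two component statistics is referred to a chi-square distribution with 2 degrees of freedom.

```python
import math


def _bernoulli_loglik(k, n):
    """Maximized Bernoulli log-likelihood for k expressing cells out of n."""
    if k == 0 or k == n:
        return 0.0  # p-hat is 0 or 1, so the likelihood is exactly 1
    p = k / n
    return k * math.log(p) + (n - k) * math.log(1.0 - p)


def _rss(values):
    """Residual sum of squares around the sample mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def combined_lrt(group_a, group_b):
    """Return (discrete LRT, continuous LRT, combined statistic, p-value).

    Hypothetical helper: zeros are treated as "off" cells and positive
    values as continuous expression measurements.
    """
    pos_a = [v for v in group_a if v > 0]
    pos_b = [v for v in group_b if v > 0]

    # Discrete component: do the groups differ in frequency of expression?
    ll_alt = (_bernoulli_loglik(len(pos_a), len(group_a))
              + _bernoulli_loglik(len(pos_b), len(group_b)))
    ll_null = _bernoulli_loglik(len(pos_a) + len(pos_b),
                                len(group_a) + len(group_b))
    lrt_disc = max(0.0, 2.0 * (ll_alt - ll_null))

    # Continuous component: do the means of the expressing cells differ?
    # With a shared variance estimated by maximum likelihood, the normal
    # LRT reduces to n * log(RSS_null / RSS_alt).
    lrt_cont = 0.0
    if pos_a and pos_b:
        pooled = pos_a + pos_b
        rss_alt = _rss(pos_a) + _rss(pos_b)
        if rss_alt > 0:
            lrt_cont = max(0.0, len(pooled) * math.log(_rss(pooled) / rss_alt))

    stat = lrt_disc + lrt_cont
    # Survival function of a chi-square with 2 degrees of freedom is exp(-x/2).
    p_value = math.exp(-stat / 2.0)
    return lrt_disc, lrt_cont, stat, p_value


if __name__ == "__main__":
    unstimulated = [0, 0, 0, 0, 0, 0, 10.1, 9.8]
    stimulated = [12.0, 12.5, 11.8, 12.2, 0, 12.1, 11.9, 12.4]
    print(combined_lrt(unstimulated, stimulated))
```

Because the two components are simply added, a gene that changes only in its frequency of expression, only in its expression level among expressing cells, or in both contributes evidence to the same statistic.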
Figures
[…] for single-cell (left, light gray) and 100-cell experiments (right, dark gray). Genes FASLG, IFN-[…], BIRC3 and CD69 are depicted. The frequency of expression of each gene in the single-cell experiments […] is printed above each histogram. The mean of the 100-cell and single-cell experiments is indicated by a thick black line along the x-axis.

[…] and […], the in silico average of single-cell wells for datasets A, B and C. In the top row, wells with […] are included and treated as exact zeroes. In the middle row, they are excluded, resulting in a clear lack of concordance. In the final row, wells are filtered as per Section 2.3. Dark, thin lines show the initial location of a gene before filtering and connect to the location of the gene after filtering. In each panel, […], the concordance correlation coefficient, and […], the average weighted squared deviation of expression measurements, are printed. The dotted black line shows a loess fit through the data. In all cases, the expression values are transformed using a shifted log-transformation […]. As such, a graphed value of zero corresponds to a zero expression value (i.e. […]).

[…] units) versus FDR, by treatment, dataset A. The combined LRT is compared with a Bernoulli or normal-theory only LRT, as well as a t-test of the raw expression values ([…] scale), including zero measurements.

[…] of tests (genes […] units) versus frequencies of expression […] of the genes. The Bernoulli, normal-theory and combined LRTs are plotted. An asterisk indicates that a test differs from the combined test at 5% significance in a Wilcoxon signed-rank test.

[…] for selected genes (rows, see main text) and all 16 individuals (columns). The color above each column indicates the antigen stimulation applied to the cells; thus, individuals are randomly arranged in each antigen block. Red and purple are two different CMV antigen pools; yellow and orange are two different HIV antigen pools.
Similar articles
- Microdroplet-based one-step RT-PCR for ultrahigh throughput single-cell multiplex gene expression analysis and rare cell detection. Sci Rep. 2021 Mar 24;11(1):6777. doi: 10.1038/s41598-021-86087-4. PMID: 33762663. Free PMC article.
- Quantitative miRNA expression analysis using fluidigm microfluidics dynamic arrays. BMC Genomics. 2011 Mar 9;12:144. doi: 10.1186/1471-2164-12-144. PMID: 21388556. Free PMC article.
- Validation of oligonucleotide microarray data using microfluidic low-density arrays: a new statistical method to normalize real-time RT-PCR data. Biotechniques. 2005 May;38(5):785-92. doi: 10.2144/05385MT01. PMID: 15945375.
- Twenty-five years of quantitative PCR for gene expression analysis. Biotechniques. 2008 Apr;44(5):619-26. doi: 10.2144/000112776. PMID: 18474036. Review.
- Microfluidic single cell analysis: from promise to practice. Curr Opin Chem Biol. 2012 Aug;16(3-4):381-90. doi: 10.1016/j.cbpa.2012.03.022. Epub 2012 Apr 21. PMID: 22525493. Review.
Cited by
- Single-Cell Transcriptomics: Current Methods and Challenges in Data Acquisition and Analysis. Front Neurosci. 2021 Apr 22;15:591122. doi: 10.3389/fnins.2021.591122. eCollection 2021. PMID: 33967674. Free PMC article. Review.
- Single-cell sequencing unveils key contributions of immune cell populations in cancer-associated adipose wasting. Cell Discov. 2022 Nov 15;8(1):122. doi: 10.1038/s41421-022-00466-3. PMID: 36376273. Free PMC article.
- Single-cell mapper (scMappR): using scRNA-seq to infer the cell-type specificities of differentially expressed genes. NAR Genom Bioinform. 2021 Feb 23;3(1):lqab011. doi: 10.1093/nargab/lqab011. eCollection 2021 Mar. PMID: 33655208. Free PMC article.
- Single-Cell Transcriptome Analysis Reveals Six Subpopulations Reflecting Distinct Cellular Fates in Senescent Mouse Embryonic Fibroblasts. Front Genet. 2020 Aug 11;11:867. doi: 10.3389/fgene.2020.00867. eCollection 2020. PMID: 32849838. Free PMC article.
- Bioinformatic and mouse model reveal the potential high vulnerability of Leydig cells on SARS-CoV-2. Ann Transl Med. 2021 Apr;9(8):678. doi: 10.21037/atm-21-936. PMID: 33987376. Free PMC article.
