Effects of Sample Size on Plant Single-Cell RNA Profiling

Curr Issues Mol Biol. 2021 Oct 20;43(3):1685-1697. doi: 10.3390/cimb43030119.

Abstract

Single-cell RNA (scRNA) profiling, or scRNA-sequencing (scRNA-seq), makes it possible to investigate in parallel the diverse molecular features of multiple cell types in a given plant tissue and to uncover cell developmental processes. In this study, we evaluated the effects of sample size (i.e., cell number) on the outcome of single-cell transcriptome analysis by sampling different numbers of cells from a pool of ~57,000 Arabidopsis thaliana root cells integrated from five published studies. Our results indicated that (1) the most significant principal components could be obtained when 20,000-30,000 cells were sampled; (2) relatively high reliability of cell clustering could be achieved with ~20,000 cells, with little further improvement from using more cells; (3) 96% of the differentially expressed genes could be identified with no more than 20,000 cells; and (4) a relatively stable pseudotime could be estimated from a subsample of 5000 cells. Together, these results provide a general guide for optimizing sample size in plant scRNA-seq studies.
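The subsampling design described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the pool is a toy stand-in of 57,000 cell labels, the cell-type names and sizes are hypothetical, and the helper names `subsample_cells` and `type_coverage` as well as the `min_cells` threshold are assumptions made for illustration. The sketch draws cells without replacement at several sample sizes and reports what fraction of cell types is still represented by a minimum number of cells, a simple proxy for sampling coverage.

```python
import random
from collections import Counter


def subsample_cells(cell_ids, n, seed=0):
    """Draw n distinct cells from the pool (sampling without replacement)."""
    rng = random.Random(seed)
    return rng.sample(cell_ids, n)


def type_coverage(labels, sampled_ids, min_cells=10):
    """Fraction of cell types in the pool that have at least
    min_cells cells in the subsample (hypothetical coverage metric)."""
    counts = Counter(labels[i] for i in sampled_ids)
    all_types = set(labels)
    covered = sum(1 for t in all_types if counts[t] >= min_cells)
    return covered / len(all_types)


# Toy pool: hypothetical cell types with uneven sizes summing to 57,000.
type_sizes = {
    "type_A": 30000,
    "type_B": 15000,
    "type_C": 8000,
    "type_D": 3500,
    "type_E": 450,
    "type_F": 50,
}
labels = [name for name, size in type_sizes.items() for _ in range(size)]
cell_ids = list(range(len(labels)))

# Evaluate coverage at a few sample sizes, including the full pool.
results = {}
for n in (500, 5000, 20000, len(cell_ids)):
    sampled = subsample_cells(cell_ids, n, seed=1)
    results[n] = type_coverage(labels, sampled)
    print(f"{n:>6} cells: coverage of cell types = {results[n]:.2f}")
```

In a real analysis, the same loop would wrap the downstream steps named in the abstract (principal component analysis, clustering, differential expression, pseudotime), and each sample size would typically be repeated with several seeds to gauge variability.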

Keywords: Arabidopsis thaliana; cell number; sampling coverage; single-cell RNA (scRNA).

MeSH terms

  • Arabidopsis / genetics
  • Cell Count
  • Cluster Analysis
  • Computational Biology / methods
  • Gene Expression Profiling*
  • High-Throughput Nucleotide Sequencing
  • Organ Specificity / genetics
  • Plants / genetics
  • RNA, Plant*
  • Sequence Analysis, RNA
  • Single-Cell Analysis* / methods
  • Single-Cell Analysis* / standards
  • Transcriptome*

Substances

  • RNA, Plant