Using single nucleotide variations in single-cell RNA-seq to identify subpopulations and genotype-phenotype linkage

Nat Commun. 2018 Nov 20;9(1):4892. doi: 10.1038/s41467-018-07170-5.


Despite its popularity, characterizing subpopulations by transcript abundance is subject to substantial noise. We propose using effective and expressed nucleotide variations (eeSNVs) from scRNA-seq as alternative features for identifying tumor subpopulations. We develop a linear modeling framework, SSrGE, that links eeSNVs to gene expression. In all the datasets tested, eeSNVs identify subpopulations more accurately than gene expression does. Previously validated cancer-relevant genes are also highly ranked, supporting the relevance of the method. Moreover, SSrGE can analyze coupled DNA-seq and RNA-seq data from the same single cells, demonstrating its value for integrating multi-omics single-cell techniques. In summary, SNV features from scRNA-seq data have merit both for identifying subpopulations and for linking genotype to phenotype.
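
The idea of linking SNVs to gene expression with a linear model can be sketched as follows. This is an illustration only, not the SSrGE implementation: it assumes an L1-penalized (sparse) least-squares fit of one gene's expression on binary SNV indicators, solved by coordinate descent, with SNVs then ranked by absolute coefficient. The function names, the penalty value, and the toy data are all hypothetical.

```python
def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty (the L1 proximal step)."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def fit_sparse_linear(snv_matrix, expression, penalty=0.05, n_sweeps=100):
    """Fit expression ~ intercept + sum_j w_j * snv_j with an L1 penalty.

    snv_matrix: one row per cell, each a list of 0/1 SNV indicators.
    expression: one expression value per cell, for a single gene.
    Returns (intercept, weights). Assumed setup, for illustration only.
    """
    n_cells = len(snv_matrix)
    n_snvs = len(snv_matrix[0])
    # Center features and response so the intercept drops out of the updates.
    col_means = [sum(row[j] for row in snv_matrix) / n_cells for j in range(n_snvs)]
    y_mean = sum(expression) / n_cells
    x = [[row[j] - col_means[j] for j in range(n_snvs)] for row in snv_matrix]
    residual = [y - y_mean for y in expression]  # residual when all weights are zero
    weights = [0.0] * n_snvs
    for _ in range(n_sweeps):
        for j in range(n_snvs):
            col = [x[i][j] for i in range(n_cells)]
            scale = sum(c * c for c in col) / n_cells
            if scale == 0.0:
                continue  # SNV is constant across cells and carries no signal
            # Covariance of SNV j with the residual that excludes SNV j itself.
            rho = sum(c * (r + weights[j] * c) for c, r in zip(col, residual)) / n_cells
            new_weight = soft_threshold(rho, penalty) / scale
            delta = new_weight - weights[j]
            residual = [r - delta * c for c, r in zip(col, residual)]
            weights[j] = new_weight
    intercept = y_mean - sum(w * m for w, m in zip(weights, col_means))
    return intercept, weights


def rank_snvs(weights):
    """Return SNV indices ordered by decreasing absolute coefficient."""
    return sorted(range(len(weights)), key=lambda j: abs(weights[j]), reverse=True)


# Hypothetical toy data: 8 cells, 3 SNVs; expression depends only on SNV 0.
snvs = [
    [1, 1, 1], [1, 0, 1], [1, 1, 0], [1, 0, 0],
    [0, 1, 1], [0, 0, 1], [0, 1, 0], [0, 0, 0],
]
expr = [1.0 + 2.0 * row[0] for row in snvs]

intercept, weights = fit_sparse_linear(snvs, expr)
ranking = rank_snvs(weights)
print(ranking[0], weights)
```

In this toy setting the SNV that drives expression receives the only non-zero coefficient and ranks first, while the penalty shrinks the unrelated SNVs to zero. A real analysis would fit one such model per gene across many cells and would need to choose the penalty by cross-validation or a similar procedure.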

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Gene Expression Profiling / methods
  • Gene Expression Regulation, Neoplastic*
  • Genotype
  • High-Throughput Nucleotide Sequencing / methods*
  • Humans
  • Models, Genetic
  • Neoplasms / genetics*
  • Neoplasms / pathology
  • Phenotype
  • Polymorphism, Single Nucleotide*
  • Single-Cell Analysis / methods*