CAISC: A software to integrate copy number variations and single nucleotide mutations for genetic heterogeneity profiling and subclone detection by single-cell RNA sequencing

BMC Bioinformatics. 2022 Mar 21;23(Suppl 3):98. doi: 10.1186/s12859-022-04625-x.

Abstract

Background: Although both copy number variations (CNVs) and single nucleotide variations (SNVs) detected by single-cell RNA sequencing (scRNA-seq) are used to study intratumor heterogeneity and detect clonal groups, no software is available that integrates these two types of data from the same cells.

Results: We developed Clonal Architecture with Integration of SNV and CNV (CAISC), an R package for scRNA-seq data analysis that clusters single cells into distinct subclones by integrating CNV and SNV genotype matrices using an entropy-weighted approach. We tested CAISC on simulated data and four real datasets, which confirmed its high accuracy in subclone identification and assignment, including subclones that cannot be identified from one type of data alone. Furthermore, integrating SNV and CNV data allowed accurate examination of expression changes between subclones, as demonstrated by the trisomy 8 clones in the myelodysplastic syndromes (MDS) dataset.
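To make the idea of entropy-weighted integration concrete, the following is a minimal Python sketch and not the CAISC implementation (CAISC itself is an R package, and the abstract does not give its formula). It assumes, purely for illustration, that each genotype matrix receives a weight proportional to the Shannon entropy of its values and that the per-matrix cell-to-cell distances are then combined as a weighted sum. The function names `shannon_entropy`, `pairwise_distance`, and `integrate`, the Hamming-style distance, and the toy matrices are all hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(matrix):
    """Shannon entropy (in bits) of the discrete values in a cells x features matrix."""
    values = [v for row in matrix for v in row]
    total = len(values)
    counts = Counter(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def pairwise_distance(matrix):
    """Fraction of features at which each pair of cells differs (Hamming-style)."""
    n_cells = len(matrix)
    n_features = len(matrix[0])
    return [
        [
            sum(a != b for a, b in zip(matrix[i], matrix[j])) / n_features
            for j in range(n_cells)
        ]
        for i in range(n_cells)
    ]


def integrate(snv, cnv):
    """Combine SNV and CNV distances using entropy-proportional weights.

    Assumption for this sketch only: a matrix whose values are more varied
    (higher entropy) is treated as more informative and weighted more heavily.
    """
    h_snv, h_cnv = shannon_entropy(snv), shannon_entropy(cnv)
    total = h_snv + h_cnv
    if total == 0:
        # Neither matrix varies at all; fall back to equal weights.
        w_snv = w_cnv = 0.5
    else:
        w_snv, w_cnv = h_snv / total, h_cnv / total
    d_snv, d_cnv = pairwise_distance(snv), pairwise_distance(cnv)
    n_cells = len(snv)
    combined = [
        [w_snv * d_snv[i][j] + w_cnv * d_cnv[i][j] for j in range(n_cells)]
        for i in range(n_cells)
    ]
    return combined, (w_snv, w_cnv)


# Toy data: four cells, two features per data type (values are made up).
# SNV genotypes separate cells 0-1 from cells 2-3.
snv = [[1, 0], [1, 0], [0, 1], [0, 1]]
# CNV states separate cell 3 from cells 0-2.
cnv = [[2, 2], [2, 2], [2, 2], [3, 3]]

combined, weights = integrate(snv, cnv)
d_snv_only = pairwise_distance(snv)
```

In this toy case cells 2 and 3 are indistinguishable by SNV alone (`d_snv_only[2][3]` is zero) but are separated in `combined`, which mirrors the abstract's point that some subclones are only visible when both data types are used. Any clustering method that accepts a distance matrix could then be applied to `combined`.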

Conclusions: CAISC is a powerful tool for integrating CNV and SNV data from scRNA-seq to identify clonal clusters more accurately than is possible with a single type of data. CAISC also allows users to interactively examine clonal assignments.

Keywords: Copy number variation; Entropy-based weighted integration; Single nucleotide variation; Single-cell RNA sequencing.

MeSH terms

  • DNA Copy Number Variations*
  • Genetic Heterogeneity
  • Mutation
  • Nucleotides*
  • Sequence Analysis, RNA / methods
  • Software

Substances

  • Nucleotides