Somatic variant calling from single-cell DNA sequencing data

Comput Struct Biotechnol J. 2022 Jun 14;20:2978-2985. doi: 10.1016/j.csbj.2022.06.013. eCollection 2022.


Single-cell sequencing has gained popularity in recent years. Despite its numerous applications, single-cell DNA sequencing data is highly error-prone because of technical biases arising from uneven sequencing coverage, allelic dropout, and amplification error. These artifacts make the identification of somatic genomic variants challenging, and several methods have been developed specifically for this type of data. Single-cell variant callers implement distinct strategies, make different use of the data, and typically produce many discordant calls when applied to real data. Here, we review current approaches for single-cell variant calling, with an emphasis on single nucleotide variants. We highlight their potential benefits and shortcomings to help users choose a suitable tool for the data at hand.

Keywords: Allele dropout; Amplification error; Single-cell genomics; Somatic variants; Variant calling.

Abbreviations: ADO, allelic dropout; CNV, copy number variant; hSNP, heterozygous single-nucleotide polymorphism; Indel, short insertion or deletion; LDO, locus dropout; scATAC-seq, single-cell sequencing assay for transposase-accessible chromatin; scDNA-seq, single-cell DNA sequencing; scHi-C, single-cell Hi-C sequencing; scMethyl-seq, single-cell methylation sequencing; scRNA-seq, single-cell RNA sequencing; scWGA, single-cell whole-genome amplification; SNV, single nucleotide variant; SV, structural variant; VAF, variant allele frequency.

Publication types

  • Review