Motivation: Single-cell bisulfite sequencing (BS-seq) techniques have been developed to detect DNA methylation heterogeneity and to study samples with limited material. However, data deficiencies, such as a low read mapping ratio, remain a critical issue.
Results: We comprehensively characterize single-cell BS-seq data and show that chimerical molecules are the major source of alignment failures. These chimerical molecules are produced by recombination of genomically proximal sequences that share microhomology regions (MR) after bisulfite conversion. In addition, we find that DNA methylation within MR is highly variable, suggesting that these regions must be removed to estimate DNA methylation levels accurately. We further develop scBS-map, which performs quality control and local alignment of bisulfite sequencing data, chimerical molecule determination and MR removal. Using scBS-map, we show remarkable increases in uniquely mapped reads, genomic coverage and the number of CpG sites, and we recover more functional elements with precise DNA methylation estimation.
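To make the MR removal step concrete, the following is a minimal sketch and not the scBS-map implementation. It assumes that a chimerical read yields two local alignments, that the MR is the stretch of read coordinates covered by both alignments, and that both alignments are on the forward strand without indels. The names `Segment`, `microhomology` and `trim_mr` are hypothetical and introduced only for illustration.

```python
from typing import NamedTuple, Optional, Tuple


class Segment(NamedTuple):
    """One local alignment of part of a read (hypothetical structure)."""
    read_start: int  # 0-based, inclusive position in the read
    read_end: int    # 0-based, exclusive position in the read
    chrom: str
    ref_start: int   # 0-based reference position of read_start


def microhomology(a: Segment, b: Segment) -> Optional[Tuple[int, int]]:
    """Return the read interval covered by both segments, or None."""
    start = max(a.read_start, b.read_start)
    end = min(a.read_end, b.read_end)
    return (start, end) if start < end else None


def trim_mr(a: Segment, b: Segment):
    """Remove the shared interval (assumed MR) from both segments.

    Assumes forward-strand, gap-free alignments, so a shift in read
    coordinates equals the same shift in reference coordinates.
    """
    first, second = sorted((a, b), key=lambda s: s.read_start)
    mr = microhomology(first, second)
    if mr is None:
        return first, second, None
    mr_start, mr_end = mr
    shift = mr_end - second.read_start
    trimmed_first = first._replace(read_end=mr_start)
    trimmed_second = second._replace(
        read_start=mr_end, ref_start=second.ref_start + shift
    )
    return trimmed_first, trimmed_second, mr


# Example: two segments of one 100 bp read that overlap in read positions 52-59.
left = Segment(0, 60, "chr1", 1000)
right = Segment(52, 100, "chr1", 1500)
trimmed_left, trimmed_right, mr = trim_mr(left, right)
```

Under these assumptions, methylation would then be called only from the trimmed segments, so that CpG sites inside the overlapping interval do not contribute to the estimate.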
Availability and implementation: The scBS-map software is freely available at https://github.com/wupengomics/scBS-map.
Supplementary information: Supplementary data are available at Bioinformatics online.
© The Author(s) 2019. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: firstname.lastname@example.org.