SCAFE: a software suite for analysis of transcribed cis-regulatory elements in single cells

Bioinformatics. 2022 Nov 15;38(22):5126-5128. doi: 10.1093/bioinformatics/btac644.


Motivation: Cell type-specific activities of cis-regulatory elements (CREs) are central to understanding gene regulation and disease predisposition. Single-cell RNA 5' end sequencing (sc-end5-seq) captures transcription start sites (TSS), which can be used as a proxy for the activity of transcribed CREs (tCREs). However, a substantial fraction of the TSS identified from sc-end5-seq data may not be genuine because of various artifacts, which hinders the use of sc-end5-seq for de novo discovery of tCREs.

Results: We developed SCAFE (Single-Cell Analysis of Five-prime Ends), a software suite that processes sc-end5-seq data to identify TSS clusters de novo using multiple logistic regression. It annotates tCREs based on the identified TSS clusters and generates a tCRE-by-cell count matrix for downstream analyses. The suite consists of a set of flexible tools that can be run either independently or as pre-configured workflows.
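
The two steps named above, scoring TSS clusters with a multiple logistic regression and tallying a tCRE-by-cell count matrix, can be illustrated with a minimal Python sketch. This is not SCAFE's code (SCAFE itself is written in Perl and R): the feature names (`log_umi_count`, `frac_unencoded_g`), the coefficients, the 0.5 cutoff, the input tuples and all function names are hypothetical, chosen only to show the shape of the computation. For simplicity, each retained cluster is treated as one tCRE.

```python
import math
from collections import Counter

# Hypothetical coefficients for illustration only; NOT SCAFE's fitted model.
INTERCEPT = -4.0
COEFS = {"log_umi_count": 1.5, "frac_unencoded_g": 3.0}


def tss_cluster_probability(features):
    """Multiple logistic regression: p = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    z = INTERCEPT + sum(COEFS[name] * features[name] for name in COEFS)
    return 1.0 / (1.0 + math.exp(-z))


def count_matrix(reads, tcres):
    """Count read 5' ends per (tCRE id, cell barcode).

    reads: iterable of (cell_barcode, chrom, strand, five_prime_pos)
    tcres: dict of tCRE id -> (chrom, strand, start, end), 0-based half-open
    """
    counts = Counter()
    for barcode, chrom, strand, pos in reads:
        for tcre_id, (t_chrom, t_strand, start, end) in tcres.items():
            if chrom == t_chrom and strand == t_strand and start <= pos < end:
                counts[(tcre_id, barcode)] += 1
    return counts


# Toy candidate clusters: genomic interval plus hypothetical feature values.
clusters = {
    "c1": {"interval": ("chr1", "+", 100, 150),
           "features": {"log_umi_count": 3.0, "frac_unencoded_g": 0.8}},
    "c2": {"interval": ("chr1", "+", 400, 420),
           "features": {"log_umi_count": 1.0, "frac_unencoded_g": 0.1}},
}

# Keep clusters whose predicted probability passes an assumed cutoff of 0.5.
probs = {cid: tss_cluster_probability(c["features"]) for cid, c in clusters.items()}
tcres = {cid: clusters[cid]["interval"] for cid, p in probs.items() if p >= 0.5}

# Toy reads: (cell barcode, chromosome, strand, 5' end position).
reads = [
    ("AAAC", "chr1", "+", 120),
    ("AAAC", "chr1", "+", 130),
    ("AAAC", "chr1", "+", 150),  # outside the half-open interval [100, 150)
    ("TTTG", "chr1", "+", 101),
    ("TTTG", "chr1", "-", 120),  # wrong strand
    ("TTTG", "chr1", "+", 410),  # falls in c2, which was filtered out
]
matrix = count_matrix(reads, tcres)
```

The nested loop keeps the sketch short; real data at genome scale would need an interval index and a sparse matrix rather than a `Counter`.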

Availability and implementation: SCAFE is implemented in Perl and R. The source code and documentation are freely available for download under the MIT License, and Docker images are also provided. The submitted software version and the test data are archived separately.

Supplementary information: Supplementary data are available at Bioinformatics online.

MeSH terms

  • Regulatory Sequences, Nucleic Acid*
  • Software*
  • Transcription Initiation Site
  • Workflow