Bookend: precise transcript reconstruction with end-guided assembly

Genome Biol. 2022 Jun 29;23(1):143. doi: 10.1186/s13059-022-02700-3.

Abstract

We developed Bookend, a package for transcript assembly that incorporates data from different RNA-seq techniques, with a focus on identifying and utilizing RNA 5' and 3' ends. We demonstrate that correct identification of transcript start and end sites is essential for precise full-length transcript assembly. Using the end-labeled reads present in full-length single-cell RNA-seq datasets dramatically improves the precision of transcript assembly in single cells. Finally, we show that hybrid assembly across short-read, long-read, and end-capture RNA-seq datasets from Arabidopsis thaliana, as well as meta-assembly of RNA-seq data from single mouse embryonic stem cells, can produce reference-quality end-to-end transcript annotations.

Keywords: 5′ and 3′ ends; Capping; Iso-Seq; Long-read; PAS; Polyadenylation; RNA-seq; Single-cell; TSS; Transcriptome.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Arabidopsis* / genetics
  • Mice
  • RNA* / genetics
  • RNA-Seq
  • Sequence Analysis, RNA / methods
  • Transcriptome

Substances

  • RNA