Internal and external normalization of nascent RNA sequencing run-on experiments

Zachary L Maas; Robin D Dowell

doi:10.1186/s12859-023-05607-3

BMC Bioinformatics. 2024 Jan 12;25(1):19. doi: 10.1186/s12859-023-05607-3.

Authors

Zachary L Maas¹ ², Robin D Dowell³ ⁴ ⁵

Affiliations

¹ Department of Computer Science, University of Colorado, Boulder, USA.
² BioFrontiers Institute, University of Colorado, Boulder, USA.
³ Department of Computer Science, University of Colorado, Boulder, USA. robin.dowell@colorado.edu.
⁴ BioFrontiers Institute, University of Colorado, Boulder, USA. robin.dowell@colorado.edu.
⁵ Department of Molecular, Cellular, and Developmental Biology, University of Colorado, Boulder, USA. robin.dowell@colorado.edu.

Abstract

In experiments with significant perturbations to transcription, nascent RNA sequencing protocols depend on external spike-ins for reliable normalization. Unlike in RNA-seq, these spike-ins are not standardized and, in many cases, depend on a run-on reaction that is assumed to have constant efficiency across samples. To assess the validity of this assumption, we analyze a large number of published nascent RNA spike-ins to quantify their variability across existing normalization methods. Furthermore, we develop a new biologically informed Bayesian model to estimate the error in spike-in based normalization estimates, which we term Virtual Spike-In (VSI). We apply this method both to published external spike-ins and to reads at the 3′ end of long genes, building on prior work from Mahat et al. (Mol Cell 62(1):63-78, 2016. https://doi.org/10.1016/j.molcel.2016.02.025) and Vihervaara et al. (Nat Commun 8(1):255, 2017. https://doi.org/10.1038/s41467-017-00151-0). We find that spike-ins in existing nascent RNA experiments are typically undersequenced, with high variability between samples. Furthermore, we show that these high-variability estimates can have significant downstream effects on analysis, complicating biological interpretation of results.

Keywords: Bayesian; Nascent RNA sequencing; Normalization.

MeSH terms

Bayes Theorem
RNA* / genetics
RNA-Seq
Sequence Analysis, RNA

Substances

RNA
