Internal and external normalization of nascent RNA sequencing run-on experiments

BMC Bioinformatics. 2024 Jan 12;25(1):19. doi: 10.1186/s12859-023-05607-3.

Abstract

In experiments with significant perturbations to transcription, nascent RNA sequencing protocols depend on external spike-ins for reliable normalization. Unlike in RNA-seq, these spike-ins are not standardized and, in many cases, depend on a run-on reaction that is assumed to have constant efficiency across samples. To assess the validity of this assumption, we analyze a large number of published nascent RNA spike-ins to quantify their variability across existing normalization methods. Furthermore, we develop a new biologically informed Bayesian model, which we term Virtual Spike-In (VSI), to estimate the error in spike-in-based normalization estimates. We apply this method both to published external spike-ins and to reads at the 3′ end of long genes, building on prior work from Mahat et al. (Mol Cell 62(1):63-78, 2016. https://doi.org/10.1016/j.molcel.2016.02.025 ) and Vihervaara et al. (Nat Commun 8(1):255, 2017. https://doi.org/10.1038/s41467-017-00151-0 ). We find that spike-ins in existing nascent RNA experiments are typically undersequenced, with high variability between samples. Furthermore, we show that these highly variable estimates can have significant downstream effects on analysis, complicating biological interpretation of results.
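The link between shallow spike-in sequencing and unreliable normalization can be illustrated with a minimal sketch. This is not the VSI model described above; it is a simple Beta-binomial illustration under a uniform prior, and the function names, read counts, and prior are all hypothetical choices made for the example. It shows that, for the same underlying spike-in fraction, fewer spike-in reads give a much wider credible interval on the fraction that a normalization factor would be derived from.

```python
import random
import statistics


def spikein_fraction_interval(spike_reads, total_reads, n_draws=20000, seed=0):
    """Summarize the posterior of the spike-in read fraction.

    Assumes (for illustration only) a Beta(1, 1) prior and a binomial
    likelihood, so the posterior is Beta(1 + spike, 1 + non-spike).
    Returns (median, lower, upper) with a 95% credible interval
    estimated from Monte Carlo draws.
    """
    rng = random.Random(seed)
    a = 1 + spike_reads
    b = 1 + (total_reads - spike_reads)
    draws = sorted(rng.betavariate(a, b) for _ in range(n_draws))
    lower = draws[int(0.025 * n_draws)]
    upper = draws[int(0.975 * n_draws)]
    return statistics.median(draws), lower, upper


def relative_width(spike_reads, total_reads):
    """Width of the 95% interval relative to the posterior median."""
    median, lower, upper = spikein_fraction_interval(spike_reads, total_reads)
    return (upper - lower) / median


# Two hypothetical samples with the same spike-in fraction (5e-5)
# but a 100-fold difference in sequencing depth.
shallow = relative_width(50, 1_000_000)
deep = relative_width(5_000, 100_000_000)
print(f"relative interval width, 50 spike-in reads:   {shallow:.2f}")
print(f"relative interval width, 5000 spike-in reads: {deep:.2f}")
```

Because a scale factor derived from the spike-in fraction inherits this uncertainty, a sample with only tens of spike-in reads can carry a normalization error of tens of percent, which is large enough to affect downstream differential analysis.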

Keywords: Bayesian; Nascent RNA sequencing; Normalization.

MeSH terms

  • Bayes Theorem
  • RNA* / genetics
  • RNA-Seq
  • Sequence Analysis, RNA

Substances

  • RNA