Delineating yeast cleavage and polyadenylation signals using deep learning

bioRxiv [Preprint]. 2023 Oct 13:2023.10.10.561764. doi: 10.1101/2023.10.10.561764.

Abstract

3'-end cleavage and polyadenylation is an essential process for eukaryotic mRNA maturation. In yeast species, the polyadenylation signals that recruit the processing machinery are degenerate and remain poorly characterized compared to the well-defined regulatory elements in mammals. In particular, recent deep sequencing experiments showed extensive cleavage heterogeneity for some mRNAs in Saccharomyces cerevisiae and uncovered polyA motif differences between S. cerevisiae and Schizosaccharomyces pombe. These findings raised the fundamental question of how polyadenylation signals are formed in yeast. Here we addressed this question by developing deep learning models to deconvolute degenerate cis-regulatory elements and quantify their positional importance in mediating yeast polyA site formation, cleavage heterogeneity, and strength. In S. cerevisiae, cleavage heterogeneity is promoted by the depletion of U-rich elements around polyA sites as well as by multiple occurrences of upstream UA-rich elements. Sites with high cleavage heterogeneity show overall lower strength. Site strength and tandem site distances modulate alternative polyadenylation (APA) under diauxic stress. Finally, we developed a deep learning model to reveal the distinct motif configuration of S. pombe polyA sites, which show more precise cleavage than those of S. cerevisiae. Altogether, our deep learning models provide unprecedented insights into polyA site formation across yeast species.
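
The abstract does not say how cleavage heterogeneity is scored. As an illustration only, one common way to summarize how spread out cleavage events are within a single polyA site is the Shannon entropy of the per-position read counts. The sketch below assumes that definition; the function name `cleavage_entropy` and the example counts are hypothetical and are not taken from the preprint.

```python
import math


def cleavage_entropy(read_counts):
    """Shannon entropy (in bits) of cleavage positions within one polyA site.

    ASSUMPTION: entropy is used here as a stand-in heterogeneity score;
    the preprint's own metric may differ.
    read_counts: reads supporting cleavage at each nucleotide position.
    """
    total = sum(read_counts)
    if total <= 0:
        raise ValueError("need at least one read")
    entropy = 0.0
    for count in read_counts:
        if count > 0:  # positions with zero reads contribute nothing
            p = count / total
            entropy -= p * math.log2(p)
    return entropy


# Hypothetical counts: all reads at one position vs. spread evenly over five.
precise_site = [0, 0, 100, 0, 0]
heterogeneous_site = [20, 20, 20, 20, 20]

print(cleavage_entropy(precise_site))
print(cleavage_entropy(heterogeneous_site))
```

Under this assumed score, a site cleaved at a single nucleotide has zero entropy, while a site whose reads are spread evenly over five positions reaches the maximum for five positions, log2(5) bits.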

Publication types

  • Preprint