Phylogeny of NF-YA trans-activation splicing isoforms in vertebrate evolution

Genomics. 2022 May 16;114(4):110390. doi: 10.1016/j.ygeno.2022.110390. Online ahead of print.


NF-Y is a trimeric pioneer Transcription Factor (TF) whose target sequence, the CCAAT box, is present in ~25% of mammalian promoters. We reconstruct the phylogenetic history of the regulatory NF-YA subunit in vertebrates. We find that, in addition to the remarkable conservation of the subunit-interaction and DNA-binding parts, the Transcriptional Activation Domain (TAD) is also conserved (>90% identity among bony vertebrates). We infer the phylogeny of the alternatively spliced exon-3 and of the partial splicing events of exon-7 (7N and 7C), revealing independent clade-specific losses of these regions. These isoforms shape the TAD. Absence of exon-3 in basal deuterostomes, cartilaginous fishes and hagfish, but not in lampreys, suggests that the "short" isoform is primordial, with emergence of exon-3 in chordates. Exon-7N was present in the vertebrate common ancestor, while 7C is a molecular innovation of teleost fishes. RNA-seq analysis in several species confirms expression of all these isoforms. We identify three blocks of amino acids in the TAD shared across deuterostomes, yet structural predictions and sequence analyses suggest an evolutionary drive for maintenance of an Intrinsically Disordered Region (IDR) within the TAD. Overall, these data help reconstruct the logic for alternative splicing of this essential eukaryotic TF.
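
To make a conservation figure such as ">90% identity" concrete, the sketch below computes percent identity between two pre-aligned protein sequences. It is illustrative only: the abstract does not state how identity was calculated, so the convention used here (matches divided by ungapped aligned columns) is an assumption, the function name `percent_identity` is hypothetical, and the two short sequences are made-up toy strings, not NF-YA data.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over columns where neither sequence has a gap.

    Assumption: both inputs come from the same alignment and therefore
    have equal length. Other conventions (e.g. dividing by the full
    alignment length) give different numbers.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0  # columns with a residue in both sequences
    matches = 0   # compared columns where the residues are identical
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned sequences (hypothetical, not taken from the paper).
toy_a = "QQAVQ-TLQA"
toy_b = "QQAVQGTLQV"
identity = percent_identity(toy_a, toy_b)
print(f"{identity:.1f}% identity over ungapped columns")
```

Excluding gapped columns keeps lineage-specific insertions or deletions, such as a missing exon, from lowering the identity of the regions that are actually shared. Whether that is the appropriate choice depends on the question being asked.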

Keywords: Alternative splicing; Evolution; Glutamine-rich; Intrinsically disordered region; NFYA; Transactivation domain; Transcription factor.