Developmentally programmed genome rearrangements are rare in vertebrates, but have been reported in scattered lineages including the bandicoot, hagfish, lamprey, and zebra finch (Taeniopygia guttata). In the finch, a well-studied animal model for neuroendocrinology and vocal learning, one such programmed genome rearrangement involves a germline-restricted chromosome, or GRC, which is found in the germlines of both sexes but eliminated from mature sperm [3, 4]. Transmitted only through the oocyte, it displays uniparental, female-driven inheritance, and early in embryonic development it is apparently eliminated from all somatic tissue in both sexes [3, 4]. The GRC is the longest finch chromosome at over 120 million base pairs, and previously the only known GRC-derived sequence was repetitive and non-coding. Because the zebra finch genome project was sourced from male muscle (somatic) tissue, the remaining genomic sequence and protein-coding content of the GRC remain unknown. Here we report the first protein-coding gene from the GRC: a member of the α-soluble N-ethylmaleimide sensitive fusion protein (NSF) attachment protein (α-SNAP) family hitherto missing from zebra finch gene annotations. In addition to the GRC-encoded α-SNAP, we find a second, paralogous α-SNAP residing in the somatic genome (a somatolog), making the zebra finch the first example in which α-SNAP is not a single-copy gene. We show divergent, sex-biased expression for the paralogs and also that positive selection is detectable across the bird α-SNAP lineage, including the GRC-encoded α-SNAP. This study presents the identification and evolutionary characterization of the first protein-coding GRC gene in any organism.
Keywords: Germline-restricted chromosome; RNA-seq; genomics; next-generation sequencing and assembly; phylogenomics; soluble NSF attachment; zebra finch.
Copyright © 2018 Elsevier Ltd. All rights reserved.