BMC Evol Biol. 2007 Oct 4;7:187.
doi: 10.1186/1471-2148-7-187.

Family expansion and gene rearrangements contributed to the functional specialization of PRDM genes in vertebrates

Irene Fumasoni et al. BMC Evol Biol. 2007.

Abstract

Background: Progressive diversification of paralogs after gene expansion is essential to increase their functional specialization. However, the mode and tempo of this divergence remain mostly unclear. Here we report a comparative analysis of the PRDM genes, a family of putative transcriptional regulators involved in human tumorigenesis.

Results: Our analysis showed that the PRDM genes originated in metazoans, expanded in vertebrates and further duplicated in primates. We experimentally showed that fast-evolving paralogs are poorly expressed, and that the most recent duplicates, such as primate-specific PRDM7, acquire tissue-specificity. PRDM7 underwent major structural rearrangements that decreased the number of encoded Zn-Fingers and modified gene splicing. Through internal duplication and activation of a non-canonical splice site (GC-AG), PRDM7 can acquire a novel intron. We also detected an alternative isoform that can retain the intron in the mature transcript and that is predominantly expressed in human melanocytes.

Conclusion: Our findings show that (a) molecular evolution of paralogs correlates with their expression pattern; (b) gene diversification is obtained through massive genomic rearrangements; and (c) splicing modification contributes to the functional specialization of novel genes.

Figures

Figure 1
Domain architecture and phylogenetic distribution of the PRDM proteins. (A) Domain architecture of the human PRDM paralogs. For each PRDM protein, the corresponding RefSeq Accession Number and additional names are provided. Protein regions longer than 1000 amino acids and with no domains are shown as vertical bars (PRDM2, 1239 amino acids; PRDM15, 1507 amino acids; PRDM16, 12769 amino acids). (B) Distribution of PRDM genes in representatives of metazoans. For each species, the number of PRDM paralogs is reported in brackets. The species highlighted in cyan are vertebrates; those in yellow are invertebrates. FB, family birth; FE, family expansion; GD, gene duplication.
Figure 2
Phylogenetic tree and expression patterns of the PRDM genes. (A) Phylogenetic tree of the PRDM genes in vertebrates. The reported topology was obtained with Maximum Likelihood. Branches with bootstrap support lower than 75 are shown in grey. On the main bifurcations, the corresponding posterior probability from Bayesian inference is reported (see Methods). The different colours of the tree branches correspond to the main subfamilies. For each subfamily, the gene structure of the human PRDM ortholog is depicted. The scale refers to exons only. The tree image was produced using iTOL [44]. (B) Evolutionary speed and gene expression of the human PRDM paralogs. PRDM genes are ordered by increasing evolutionary divergence, calculated as the cumulative branch length from the tip to the root of the phylogenetic tree. The expression data were measured as the mean values of different assays for each gene (see Methods). The upper limit of the 2^-ΔCt values was set to 10. For original values see Additional file 7.
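The two quantities used in panel (B) can be illustrated with a short sketch. This is not the authors' pipeline: the tree encoding (each node mapped to its parent and branch length), the gene names, and the Ct values are hypothetical, and ΔCt is assumed to mean Ct(target) minus Ct(reference), which the caption does not spell out.

```python
def tip_to_root_length(tree, tip):
    """Sum the branch lengths on the path from a tip up to the root.

    `tree` maps each node to (parent, branch_length); the root maps to (None, 0.0).
    """
    total = 0.0
    node = tip
    while tree[node][0] is not None:
        parent, length = tree[node]
        total += length
        node = parent
    return total


def relative_expression(ct_target, ct_reference, cap=10.0):
    """Return 2^-ΔCt, capped at `cap` (the caption sets the upper limit to 10).

    Assumption: ΔCt = Ct(target) - Ct(reference).
    """
    delta_ct = ct_target - ct_reference
    return min(2.0 ** (-delta_ct), cap)


# Hypothetical toy tree, not data from the paper.
toy_tree = {
    "root": (None, 0.0),
    "n1": ("root", 0.5),
    "geneA": ("n1", 0.25),
    "geneB": ("n1", 1.0),
    "geneC": ("root", 0.125),
}

# Order genes by increasing tip-to-root divergence, as in panel (B).
divergence = {g: tip_to_root_length(toy_tree, g) for g in ("geneA", "geneB", "geneC")}
ordered = sorted(divergence, key=divergence.get)
```

With the toy tree, the slowest-evolving gene comes first in `ordered`; a strongly expressed gene whose uncapped 2^-ΔCt would exceed 10 is reported as 10.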
Figure 3
Syntenic conservation, gene structure, and splicing variants of PRDM7 and PRDM9. (A) Comparison of the syntenic blocks around PRDM7 and PRDM9 in vertebrates. Each chromosome is depicted in a different colour, except for the genomic regions around the PRDM7 and PRDM9 genes, which are all cyan. PRDM7 and PRDM9 are represented as grey blocks. The chromosome number in the corresponding genome is provided. Dashed lines correspond to breaks of synteny. Abbreviations: Hs, Homo sapiens; Pt, Pan troglodytes; Mma, Macaca mulatta; Mm, Mus musculus; Rn, Rattus norvegicus; Gg, Gallus gallus. (B) Gene structure of PRDM7 and PRDM9. Since no mRNA sequences are available for chimp and macaque, the human PRDM7 and PRDM9 were used as templates for gene predictions. In chimp, the intron putatively gained by PRDM7 is composed of eight repeats. In the genomic regions corresponding to chimp PRDM9, there are four additional Zn-Fingers, which are reported in black because there is no evidence for their transcription. The dashed lines represent gaps in the genome assembly. In rodents, the last intron is longer and not drawn to scale; the corresponding length is reported in brackets. (C) Splicing variants of human PRDM7 and PRDM9. The grey lines represent the genomic regions of segmental duplication. The corresponding chromosome number, chromosomal coordinates and direction of transcription are given. For PRDM9, the splicing variants present in the database are shown. For PRDM7, both the database transcripts and the isoforms detected in this study are reported, together with an in-silico gene prediction obtained by using the PRDM9 long isoform as template.
Figure 4
Gene rearrangements and transcription evidence of PRDM7. (A) Effects of the internal duplication of ancestral exon 3 on PRDM7 splicing. The entire sequence of the ancestral exon 3 is reported; the 89-nucleotide segment that undergoes duplication is shown in bold and the putative cryptic splice sites in red. The duplicon is represented as underlined text. After duplication, the non-canonical splice site (GC-AG) can be activated, leading to intron splicing. The entire region can also be retained in the transcript, resulting in a protein with no Zn-Fingers due to the introduction of a frameshift. The region between the two red arrows was amplified in a variety of normal and tumoral samples, as reported in panel (B). (B) RT-PCR analysis of exon 3 in normal tissues and cancer cell lines. The upper panel reports amplifications in normal samples, the lower panel those in cancer cell lines. We verified by sequence analysis that the upper band corresponds to PRDM7 exon 3 retaining the duplicated segment. The lower band can be either exon 3 of PRDM9 or exons 3-4 of PRDM7, since the two genes are indistinguishable in this region.
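The splice-site and frameshift reasoning in panel (A) can be made concrete with a minimal sketch. The function names and the example sequences are hypothetical and are not PRDM7 sequence. The frameshift check also assumes that the retained insert is the 89-nucleotide duplicated segment, which the caption implies but does not state outright.

```python
def classify_intron(intron_seq):
    """Classify an intron by its first two and last two nucleotides (donor-acceptor pair)."""
    seq = intron_seq.upper()
    pair = (seq[:2], seq[-2:])
    if pair == ("GT", "AG"):
        return "canonical GT-AG"
    if pair == ("GC", "AG"):
        return "non-canonical GC-AG"
    return "other"


def causes_frameshift(inserted_length):
    """An insertion shifts the reading frame when its length is not a multiple of 3."""
    return inserted_length % 3 != 0


# Hypothetical toy introns, not sequences from the paper.
labels = [classify_intron(s) for s in ("GTAAGTCCAG", "gcaagtttcag", "ATCCCAC")]

# Assumption: the retained segment is 89 nucleotides long.
shift = causes_frameshift(89)
```

Because 89 leaves a remainder of 2 when divided by 3, retaining a segment of that length would move every downstream codon out of frame, which is consistent with the caption's statement that the retained isoform encodes no Zn-Fingers.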
