Proc Natl Acad Sci U S A, 109 (4), 1187-92

Divergence of Duplicate Genes in Exon-Intron Structure


Guixia Xu et al. Proc Natl Acad Sci U S A.


Gene duplication plays key roles in organismal evolution. Duplicate genes, if they survive, tend to diverge in regulatory and coding regions. Divergences in coding regions, especially those that can change the function of the gene, can be caused by amino acid-altering substitutions and/or alterations in exon-intron structure. Much has been learned about the mode, tempo, and consequences of nucleotide substitutions, yet relatively little is known about structural divergences. In this study, by analyzing 612 pairs of sibling paralogs from seven representative gene families and 300 pairs of one-to-one orthologs from different species, we investigated the occurrence and relative importance of structural divergences during the evolution of duplicate and nonduplicate genes. We found that structural divergences have been very prevalent in duplicate genes and, in many cases, have led to the generation of functionally distinct paralogs. Comparisons of the genomic sequences of these genes further indicated that the differences in exon-intron structure were actually accomplished by three main types of mechanisms (exon/intron gain/loss, exonization/pseudoexonization, and insertion/deletion), each of which contributed differently to structural divergence. Like nucleotide substitutions, insertion/deletion and exonization/pseudoexonization occurred more or less randomly, with the number of observable mutational events per gene pair being largely proportional to evolutionary time. Notably, however, compared with paralogs with similar evolutionary times, orthologs have accumulated significantly fewer structural changes, whereas the amounts of amino acid replacements accumulated did not show clear differences. This finding suggests that structural divergences have played a more important role during the evolution of duplicate than nonduplicate genes.
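The first step of such a comparison, deciding whether two gene copies differ in exon-intron structure at all, can be sketched as follows. This is only an illustration, not the authors' pipeline: the function name `classify_pair`, the category labels, and the use of coding-exon lengths as a proxy for structure are all assumptions. The study itself compared genomic sequences to distinguish exon/intron gain/loss, exonization/pseudoexonization, and insertion/deletion, which a length comparison alone cannot do.

```python
from typing import Sequence


def classify_pair(exons_a: Sequence[int], exons_b: Sequence[int]) -> str:
    """Roughly classify a gene pair from the coding-exon lengths (bp) of each copy.

    Hypothetical helper: lengths are only a proxy for exon-intron structure.
    """
    # Different exon counts: the two copies differ in exon number.
    if len(exons_a) != len(exons_b):
        return "different exon number"
    # Same exon count but at least one exon differs in length, which would be
    # consistent with an insertion/deletion or a shifted exon boundary.
    if list(exons_a) != list(exons_b):
        return "same exon number, different exon lengths"
    # Identical exon counts and lengths: no difference visible at this level.
    return "no structural difference detected"
```

For example, `classify_pair([120, 85, 240], [120, 325])` falls into the first category because the two copies have three and two exons, respectively.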

Conflict of interest statement

The authors declare no conflict of interest.


Fig. 1.
Prevalence, consequences, and the underlying mechanisms for structural divergences. (A) Stacked bar charts showing the numbers and proportions of sibling paralogs that have diverged in exon–intron structure. Red boxes represent the gene pairs in which sibling paralogs possess different numbers of exons; blue boxes stand for those that have the same numbers of exons but have experienced insertion/deletion and/or exonization/pseudoexonization events. (B) Stacked bar charts showing the numbers and proportions of structurally diverged sibling paralogs that code for proteins with distinct domain organizations and/or sequence features. Blue boxes represent those that have different numbers or types of domains; green boxes represent those that have identical numbers and types of domains but show clear differences in sequence lengths; orange boxes represent those that are indistinguishable in domain organization or sequence length but possess relatively long, unalignable regions; and pink boxes represent those that do not show clear difference in protein sequences. (C) Venn diagrams depicting the numbers of sibling paralogs that have experienced insertion/deletion (purple), exonization/pseudoexonization (gray), and exon/intron gain/loss (yellow) events. For details, see Fig. S2.
Fig. 2.
The exon–intron structures of six pairs of representative sibling paralogs and the domain organization of their proteins, showing the three types of underlying mechanisms for structural divergences. Exons that have experienced exon/intron gain/loss (A–C), exonization/pseudoexonization (B–F), and insertion/deletion (B and C) events are highlighted with pink; those without structural difference are in gray. Small white bars in B and C depict the indels that have resulted from insertion/deletion events.
Fig. 3.
Proportions of paralogous (A) and orthologous (B–D) gene pairs that have experienced insertion/deletion and exonization/pseudoexonization event(s). For simplicity, proportions of synonymous changes (PS) are used to roughly measure the evolutionary times that have elapsed since the divergence of paralogous or orthologous genes. Gray bars show the proportions of amino acid replacements (dA) between genes.
