Complex selection on 5' splice sites in intron-rich organisms

Genome Res. 2009 Nov;19(11):2021-7. doi: 10.1101/gr.089276.108. Epub 2009 Sep 10.


In contrast to the typically streamlined genomes of prokaryotes, many eukaryotic genomes are riddled with long intergenic regions, spliceosomal introns, and repetitive elements. What explains the persistence of these and other seemingly suboptimal structures? There are three general hypotheses: (1) the structures in question are not suboptimal but are in fact favored by selection, for reasons as yet unknown; (2) the structures are not suboptimal but are of essentially equal fitness to the "optimal" ones; or (3) the structures are truly suboptimal, but selection is too weak to systematically eliminate them. The 5' splice sites of introns offer a rare opportunity to test these hypotheses directly. Intron-poor species show a clear consensus splice site: most introns begin with the same six-nucleotide sequence (typically GTAAGT or GTATGT), indicating efficient selection for this consensus sequence. In contrast, intron-rich species have much less pronounced boundary consensus sequences, and only a small minority of introns in any such species share the same boundary sequence. We studied rates of evolutionary change of 5' splice sites in three groups of closely related intron-rich species: three primates, five Drosophila species, and four Cryptococcus fungi. Surprisingly, the results indicate that changes from consensus to variant nucleotides are generally disfavored by selection, whereas changes from variant to consensus nucleotides are neither favored nor disfavored. This evolutionary pattern is consistent with selective differences across introns, arising, for instance, from changes at other sites within the gene that compensate for the otherwise suboptimal consensus-to-variant changes in splice boundaries.
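
The distinction between consensus-to-variant and variant-to-consensus changes can be made concrete with a small sketch. The Python below is an illustration only, not the authors' pipeline: it assumes GTAAGT as the consensus (one of the two sequences the abstract names as typical), assumes the ancestral state of each site has already been inferred, and uses a hypothetical function name (`classify_changes`) and made-up sequence pairs.

```python
from collections import Counter

# Assumption for illustration: a single six-nucleotide consensus.
# The abstract names GTAAGT and GTATGT as typical; real consensus
# sequences differ between lineages.
CONSENSUS = "GTAAGT"


def classify_changes(ancestral, derived, consensus=CONSENSUS):
    """Count position-wise changes between two aligned 5' splice sites.

    Each differing position is labeled relative to the consensus
    nucleotide at that position.
    """
    if not (len(ancestral) == len(derived) == len(consensus)):
        raise ValueError("sequences must match the consensus length")
    counts = Counter()
    for anc, der, con in zip(ancestral, derived, consensus):
        if anc == der:
            continue  # no change at this position
        if anc == con:
            counts["consensus_to_variant"] += 1
        elif der == con:
            counts["variant_to_consensus"] += 1
        else:
            counts["variant_to_variant"] += 1
    return counts


# Hypothetical (ancestral, derived) pairs, invented for this example.
pairs = [
    ("GTAAGT", "GTAAGC"),  # position 6 leaves the consensus
    ("GTGAGT", "GTAAGT"),  # position 3 returns to the consensus
    ("GTACGG", "GTATGG"),  # position 4 changes between two variants
]

total = Counter()
for anc, der in pairs:
    total += classify_changes(anc, der)
print(dict(total))
```

Comparing such counts with the number of sites at which each kind of change was possible, and with a neutral expectation, is what would indicate whether a class of change is favored or disfavored; that normalization is outside the scope of this sketch.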

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Animals
  • Base Sequence
  • Consensus Sequence / genetics
  • Cryptococcus / classification
  • Cryptococcus / genetics
  • Cryptococcus neoformans / genetics
  • Drosophila / classification
  • Drosophila / genetics
  • Drosophila melanogaster / genetics
  • Evolution, Molecular
  • Genetic Variation*
  • Introns / genetics*
  • Macaca / genetics
  • Pan troglodytes / genetics
  • Phylogeny
  • Primates / classification
  • Primates / genetics
  • RNA Splice Sites / genetics*
  • Selection, Genetic*
  • Species Specificity

