Longer first introns are a general property of eukaryotic gene structure

PLoS One. 2008 Aug 29;3(8):e3093. doi: 10.1371/journal.pone.0003093.


While many properties of eukaryotic gene structure are well characterized, differences in the form and function of introns that occur at different positions within a transcript are less well understood. In particular, how intron length varies with intron position has received relatively little attention. This study analyzes all available data on intron lengths in GenBank and finds a significant trend of increased length in first introns across a wide range of species. The trend is even stronger in high-confidence gene annotation data for three model organisms (Arabidopsis thaliana, Caenorhabditis elegans, and Drosophila melanogaster), which show that the first intron in the 5' UTR is, on average, significantly longer than all downstream introns within a gene. A partial explanation for increased first intron length in A. thaliana is suggested by the increased frequency of certain motifs in first introns. The phenomenon of longer first introns can potentially be used to improve gene prediction software and to detect errors in existing gene annotations.
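The comparison described above, first-intron length against the length of downstream introns, can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the input layout (1-based inclusive exon coordinates plus a strand per gene), and the toy coordinates are all assumptions made for illustration, and the sketch compares simple means only, without the significance testing a real analysis would need.

```python
from statistics import mean


def intron_lengths(exons, strand="+"):
    """Return intron lengths in 5'-to-3' transcript order.

    `exons` is a list of (start, end) pairs, 1-based and inclusive
    (an assumed input layout, not taken from the paper).
    """
    ordered = sorted(exons)
    # An intron spans the gap between one exon's end and the next exon's start.
    lengths = [
        next_start - prev_end - 1
        for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:])
    ]
    # On the minus strand, transcript order is the reverse of genomic order.
    return lengths[::-1] if strand == "-" else lengths


def first_vs_downstream(genes):
    """Return (mean first-intron length, mean downstream-intron length).

    Genes with fewer than two introns are skipped, since they offer no
    within-gene comparison.
    """
    first, downstream = [], []
    for exons, strand in genes:
        lengths = intron_lengths(exons, strand)
        if len(lengths) < 2:
            continue
        first.append(lengths[0])
        downstream.extend(lengths[1:])
    return mean(first), mean(downstream)


# Hypothetical toy coordinates, invented purely to exercise the functions.
toy_genes = [
    ([(1, 100), (601, 700), (801, 900)], "+"),
    ([(1, 50), (151, 200), (501, 600)], "-"),
]

first_mean, downstream_mean = first_vs_downstream(toy_genes)
print(first_mean, downstream_mean)
```

The same position-aware length profile could, in principle, be used to flag annotations whose first intron is unusually short relative to the downstream introns, in line with the error-detection use the abstract suggests.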

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Animals
  • Arabidopsis / genetics*
  • Caenorhabditis elegans / genetics*
  • Databases, Nucleic Acid / organization & administration
  • Databases, Nucleic Acid / standards
  • Drosophila melanogaster / genetics*
  • Genes*
  • Genes, Helminth*
  • Genes, Insect*
  • Genes, Plant / genetics
  • Introns / genetics*
  • Reproducibility of Results