Signals for the selection of a splice site in pre-mRNA. Computer analysis of splice junction sequences and like sequences

J Mol Biol. 1987 May 20;195(2):247-59. doi: 10.1016/0022-2836(87)90647-4.


To evaluate the importance of the surrounding nucleotide sequence in the selection of a splice site for mRNA, we have carried out computer studies of eukaryotic protein genes whose entire nucleotide sequences were available. Splice site-like sequences that have significant homology to the consensus splice junction sequences are frequently found within introns and exons. The higher the homology of a candidate donor site sequence to the nine-nucleotide consensus sequence, the higher its probability of being a donor site. For most of the donors, the stability of presumed base-pairing with U1-RNA is higher than that of donor-like sequences, if any, in the adjacent exon and intron. However, homology of a candidate acceptor sequence to the 15-nucleotide consensus is a poor criterion of an acceptor site. The presence of a sequence that could serve as a branch-point 18 to 37 nucleotides before an acceptor does not seem to be critical in distinguishing it from an acceptor-like sequence. Nucleotide frequencies around splice junctions have been calculated separately for many genes of human, rat, mouse and chicken. They seem to differ from species to species at some positions around a donor site. The acceptors for these vertebrates have longer pyrimidine-rich regions than the previous consensus sequence. The newly derived nucleotide frequencies were used as the standard to calculate the weighted homology score of a candidate splice site sequence in a gene of the four species. This weighted homology score of the 40 to 60-nucleotide intron-exon sequence is a much better criterion of an acceptor. These results suggest that the most important signal in the selection of a splice site resides in the surrounding nucleotide sequence. It is also suggested that the surrounding nucleotide sequence alone is not generally sufficient for the selection.
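
The abstract does not give the formula behind the weighted homology score, so the following Python sketch shows only one plausible reading of the idea: derive per-position nucleotide frequencies from known sites, then score a candidate by summing the frequency of its nucleotide at each position. The function names, the 0 to 100 scaling, and the four toy donor sequences are assumptions made for illustration; they are not the authors' method or data.

```python
from collections import Counter

BASES = "ACGT"


def position_frequencies(aligned_sites):
    """Return per-position nucleotide frequencies for equal-length sites."""
    length = len(aligned_sites[0])
    if any(len(site) != length for site in aligned_sites):
        raise ValueError("all sites must have the same length")
    total = len(aligned_sites)
    table = []
    for i in range(length):
        counts = Counter(site[i] for site in aligned_sites)
        table.append({base: counts.get(base, 0) / total for base in BASES})
    return table


def weighted_score(candidate, table):
    """Score a candidate on a 0..100 scale (assumed scaling, see lead-in).

    The raw score is the sum, over positions, of the observed frequency of
    the candidate's nucleotide. It is rescaled between the lowest and the
    highest raw score that any sequence could reach with this table.
    """
    if len(candidate) != len(table):
        raise ValueError("candidate length must match the frequency table")
    raw = sum(column.get(base, 0.0) for column, base in zip(table, candidate))
    best = sum(max(column.values()) for column in table)
    worst = sum(min(column.values()) for column in table)
    if best == worst:
        return 100.0
    return 100.0 * (raw - worst) / (best - worst)


def scan(sequence, table):
    """Score every window of the table's width along a sequence."""
    width = len(table)
    return [
        (start, weighted_score(sequence[start:start + width], table))
        for start in range(len(sequence) - width + 1)
    ]


# Toy nine-nucleotide donor sites, invented for this example only.
toy_donors = ["CAGGTAAGT", "AAGGTGAGT", "CAGGTAAGA", "GAGGTAAGT"]
table = position_frequencies(toy_donors)

# Rank candidate windows in a made-up sequence by their score.
ranked = sorted(scan("TTTCAGGTAAGTTTT", table), key=lambda hit: hit[1], reverse=True)
best_start, best_score = ranked[0]
```

With real data, the table would be built from the known donor or acceptor sites of a single species, since the abstract reports that the frequencies differ between species at some positions.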

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Base Sequence
  • Computers
  • Exons
  • Humans
  • Introns
  • Mice
  • Molecular Sequence Data
  • RNA Precursors*
  • RNA Splicing*
  • RNA, Messenger*
  • Rabbits
  • Rats
  • Sequence Homology, Nucleic Acid


Substances

  • RNA Precursors
  • RNA, Messenger