Using information content and base frequencies to distinguish mutations from genetic polymorphisms in splice junction recognition sites

Hum Mutat. 1995;6(1):74-6. doi: 10.1002/humu.1380060114.


Predicting the effects of nucleotide substitutions in human splice sites has been based on analysis of consensus sequences. We used a graphic representation of sequence conservation and base frequency, the sequence logo, to demonstrate that a change in a splice acceptor of hMSH2 (a gene associated with familial nonpolyposis colon cancer) probably does not reduce splicing efficiency. This confirms a population genetic study that suggested that this substitution is a genetic polymorphism. The information theory-based sequence logo is quantitative and more sensitive than the corresponding splice acceptor consensus sequence for detecting true mutations. Information analysis may also be useful for distinguishing polymorphisms from mutations in other types of transcriptional, translational, or protein-coding motifs.
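The quantities behind a sequence logo can be illustrated with a minimal sketch. It assumes the standard formulation in which the information at each aligned position is 2 bits minus the Shannon entropy of the observed base frequencies, and each base's letter height is its frequency times that information. The function names and the toy alignment are hypothetical, invented for illustration; they are not the splice acceptor data or the software used in the study, and the small-sample correction usually applied to real alignments is omitted.

```python
import math
from collections import Counter

BASES = "ACGT"

def position_frequencies(sites):
    """Per-position base frequencies for equal-length aligned sites."""
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    freqs = []
    for column in zip(*sites):
        counts = Counter(column)
        freqs.append({b: counts[b] / len(sites) for b in BASES})
    return freqs

def information_content(freqs):
    """Bits per position: 2 - H, where H is the Shannon entropy.

    Assumption: no small-sample correction is applied.
    """
    info = []
    for f in freqs:
        entropy = -sum(p * math.log2(p) for p in f.values() if p > 0)
        info.append(2.0 - entropy)
    return info

def logo_heights(freqs):
    """Letter heights in a logo: base frequency times position information."""
    info = information_content(freqs)
    return [{b: f[b] * r for b in BASES} for f, r in zip(freqs, info)]

def substitution_frequencies(freqs, position, ref, alt):
    """Frequencies of the reference and substituted base at one position."""
    return freqs[position][ref], freqs[position][alt]

# Toy alignment, invented for illustration; NOT real splice acceptor data.
toy_sites = ["TTCAGG", "TCCAGG", "CTTAGA", "TTTAGG"]
freqs = position_frequencies(toy_sites)
info = information_content(freqs)
heights = logo_heights(freqs)

for i, bits in enumerate(info):
    print(f"position {i}: {bits:.2f} bits")

# A T-to-C change at position 2 swaps between two equally frequent bases,
# whereas an A-to-C change at position 3 introduces a base never observed there.
print(substitution_frequencies(freqs, 2, "T", "C"))
print(substitution_frequencies(freqs, 3, "A", "C"))
```

In this sketch a substitution that lands on a base still common at its position, or that falls at a low-information position, is the kind of change a logo would flag as a plausible polymorphism, while a substitution to a base absent from a fully conserved position is a candidate mutation. A consensus sequence alone would treat both as simple mismatches.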

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Base Sequence*
  • Consensus Sequence
  • Humans
  • Mutation*
  • Polymorphism, Genetic*
  • RNA Splicing / genetics*