An extended IUPAC nomenclature code for polymorphic nucleic acids

Bioinformatics. 2010 May 15;26(10):1386-9. doi: 10.1093/bioinformatics/btq098. Epub 2010 Mar 3.


The International Union of Pure and Applied Chemistry (IUPAC) code specified nearly 25 years ago provides a nomenclature for incompletely specified nucleic acids. However, no system currently exists for representing the relative abundance of alleles at polymorphic nucleic acid positions (e.g. single nucleotide polymorphisms) in a single specified character or a string of characters. Here, I propose such an information code as a natural extension of the IUPAC nomenclature code, and present some potential uses and limitations of such a code. The primary anticipated use of this extended nomenclature code is to assist in representing the rapidly growing body of information on human genetic variation.
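For context, the gap described above can be illustrated with the standard IUPAC ambiguity codes. The sketch below is a minimal illustration in Python and covers only the existing nomenclature, not the extended code proposed in the letter, whose symbols are not given in this abstract. The names `IUPAC_CODES`, `encode_site` and `decode_site` are hypothetical helpers introduced here for illustration.

```python
# Standard IUPAC nucleotide ambiguity codes (the existing nomenclature only;
# the extended code proposed in the letter is NOT reproduced here).
IUPAC_CODES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}

# Reverse lookup: set of bases -> single IUPAC character.
BASES_TO_CODE = {frozenset(bases): code for code, bases in IUPAC_CODES.items()}


def encode_site(observed_bases):
    """Return the IUPAC character for the bases observed at one site.

    Hypothetical helper: only which bases occur matters, so any
    information about how often each base occurs is discarded.
    """
    key = frozenset(base.upper() for base in observed_bases)
    if key not in BASES_TO_CODE:
        raise ValueError(f"not a set of A/C/G/T bases: {sorted(key)}")
    return BASES_TO_CODE[key]


def decode_site(code):
    """Return the set of bases that an IUPAC character stands for."""
    return set(IUPAC_CODES[code.upper()])


# A site observed as A in three samples and G in one sample encodes
# to the same character as a site with A and G in equal proportion.
skewed = encode_site(["A", "A", "A", "G"])
balanced = encode_site(["A", "G"])
print(skewed, balanced, decode_site("Y"))
```

Both the skewed and the balanced site collapse to the same character, which is the loss of relative-abundance information that the proposed extension is meant to address.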


Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Letter
  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Base Sequence
  • Computer Simulation
  • Nucleic Acids / chemistry
  • Nucleic Acids / classification*
  • Terminology as Topic


Substances

  • Nucleic Acids