The International Union of Pure and Applied Chemistry (IUPAC) code, specified nearly 25 years ago, provides a nomenclature for incompletely specified nucleic acids. However, no system currently exists for the informatics representation of the relative abundance of bases at polymorphic nucleic acid positions (e.g. single nucleotide polymorphisms) in a single character or a string of characters. Here, I propose such a code as a natural extension of the IUPAC nomenclature code, and present some of its potential uses and limitations. The primary anticipated use of this extended code is to assist in representing the rapidly growing body of information on human genetic variation.
Supplementary information: Supplementary data are available at Bioinformatics online.