Controlled vocabularies and ontologies in proteomics: overview, principles and practice

Biochim Biophys Acta. 2014 Jan;1844(1 Pt A):98-107. doi: 10.1016/j.bbapap.2013.02.017. Epub 2013 Feb 19.


This paper focuses on the use of controlled vocabularies (CVs) and ontologies in proteomics, primarily in relation to the work of the Proteomics Standards Initiative (PSI). It describes the relevant proteomics standard formats and the ontologies used within them, and discusses software and tools for working with these ontology files. The article also examines the "mapping files" used to ensure that the correct controlled vocabulary terms are placed within PSI standards and that the MIAPE (Minimum Information About a Proteomics Experiment) requirements are fulfilled. This article is part of a Special Issue entitled: Computational Proteomics in the Post-Identification Era. Guest Editors: Martin Eisenacher and Christian Stephan.

Keywords: ANDI-MS; API; ASCII; ASTM; American Society for Testing and Materials; American Standard Code for Information Interchange; Analytical Data Interchange format for Mass Spectrometry; Analytical Information Markup Language; AniML; Application Programming Interface; BRENDA (BRaunschweig ENzyme DAtabase) Tissue Ontology; BTO; CV; ChEBI; Chemical Entities of Biological Interest; Controlled Vocabulary; Controlled vocabularies; DL; Description Logic; EBI; European Bioinformatics Institute; HDF5; HUPO-PSI; Hierarchical Data Format, version 5; Human Proteome Organisation-Proteomics Standards Initiative; ICD; IUPAC; International Classification of Diseases; International Union for Pure and Applied Chemistry; JCAMP-DX; Joint Committee on Atomic and Molecular Physical data-Data eXchange format; MALDI; MI; MIAPE; MIBBI; MITAB; MS; Mass Spectrometry; Matrix Assisted Laser Desorption Ionization; MeSH; Medical Subject Headings; Minimal Information for Biological and Biomedical Investigations; Minimum Information About a Proteomics Experiment; Molecular Interaction; Molecular Interactions TABular format; NCBI; NCBO; National Center for Biomedical Ontology; National Center for Biotechnology Information; Network Common Data Format; OBI; OBO; OLS; OWL; Ontologies in proteomics; Ontology Lookup Service; Ontology editors and software; Ontology for Biomedical Investigations; Ontology formats; Ontology maintenance; Open Biological and Biomedical Ontologies; PAR; PATO; PRIDE; PRoteomics IDEntifications database; Phenotype Attribute Trait Ontology; Protein Affinity Reagents; Proteomics data standards; RDF(S); Resource Description Framework (Schema); SRM; Selected Reaction Monitoring; TPP; Trans-Proteomic Pipeline; URI; Uniform Resource Identifier; Web Ontology Language; XSLT; YAFMS; Yet Another Format for Mass Spectrometry; eXtensible Stylesheet Language Transformation; netCDF.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Review

MeSH terms

  • Programming Languages
  • Proteomics*
  • Software
  • Vocabulary, Controlled*