Semantic particularity measure for functional characterization of gene sets using gene ontology

PLoS One. 2014 Jan 28;9(1):e86525. doi: 10.1371/journal.pone.0086525. eCollection 2014.

Abstract

Background: Genetic and genomic data analyses produce large sets of genes. Functional comparison of these gene sets is a key part of the analysis, as it identifies their shared functions and the functions that distinguish each set. The Gene Ontology (GO) initiative provides a unified reference for analyzing the genes' molecular functions, biological processes and cellular components. Numerous semantic similarity measures have been developed to systematically quantify the weight of the GO terms shared by two genes. We studied how gene set comparisons can be improved by considering gene set particularity in addition to gene set similarity.

Results: We propose a new approach to compute gene set particularities based on the information conveyed by GO terms. A GO term's informativeness can be computed using either its information content, based on the term's frequency in a corpus, or a function of the term's distance to the root. We defined the semantic particularity of a set of GO terms Sg1 compared to another set of GO terms Sg2. We combined our particularity measure with a similarity measure to compare gene sets. We demonstrated that the combination of semantic similarity and semantic particularity measures was able to identify genes with particular functions from among similar genes. This differentiation was not recognized using only a semantic similarity measure.
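
The idea of combining the two measures can be illustrated with a minimal Python sketch. The abstract does not state the paper's formulas, so everything below is an assumption made for illustration: information content is taken as the common corpus-based form IC(t) = -log p(t), and the functions toy_similarity and toy_particularity are hypothetical stand-ins (the most informative shared term, and the most informative term of Sg1 absent from Sg2), not the authors' definitions. The term names and counts are invented toy data, and each term set is assumed to already include its ancestors.

```python
import math

def information_content(term_counts, total):
    """Corpus-based informativeness, assumed here to be IC(t) = -log(p(t))."""
    return {term: -math.log(count / total) for term, count in term_counts.items()}

def toy_similarity(sg1, sg2, ic):
    """Hypothetical stand-in: IC of the most informative term shared by both sets."""
    return max((ic[t] for t in sg1 & sg2), default=0.0)

def toy_particularity(sg1, sg2, ic):
    """Hypothetical stand-in: IC of the most informative term in sg1 absent from sg2."""
    return max((ic[t] for t in sg1 - sg2), default=0.0)

# Invented toy corpus: number of annotations per term, out of 100 in total.
counts = {"root": 100, "transport": 20, "water transport": 5, "protein import": 4}
ic = information_content(counts, total=100)

# Two term sets, assumed to be closed under ancestors.
sg1 = {"root", "transport", "water transport"}
sg2 = {"root", "transport", "protein import"}

print("similarity:", toy_similarity(sg1, sg2, ic))
print("particularity of Sg1 vs Sg2:", toy_particularity(sg1, sg2, ic))
print("particularity of Sg2 vs Sg1:", toy_particularity(sg2, sg1, ic))
```

In this toy setting the two sets share an informative term, so a similarity score alone would not separate them, while each set also holds a specific term the other lacks. The particularity score is directional: comparing Sg1 to Sg2 need not give the same value as comparing Sg2 to Sg1.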

Conclusion: Semantic particularity should be used in conjunction with semantic similarity to perform functional analysis of GO-annotated gene sets. The principle is generalizable to other ontologies.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Aquaporins / metabolism
  • Biological Transport
  • Databases, Genetic*
  • Gene Ontology*
  • Genes*
  • Genes, Fungal
  • Humans
  • Karyopherins / genetics
  • Rats
  • Saccharomyces cerevisiae / genetics
  • Semantics*
  • Sequence Homology, Nucleic Acid
  • Tryptophan / metabolism

Substances

  • Aquaporins
  • Karyopherins
  • Tryptophan

Grants and funding

CB was supported by a fellowship from the French ministry of research. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.