Integrated pathway clusters with coherent biological themes for target prioritisation

PLoS One. 2014 Jun 11;9(6):e99030. doi: 10.1371/journal.pone.0099030. eCollection 2014.


Prioritising candidate genes for further experimental characterisation is an essential yet challenging task in biomedical research. One way of achieving this goal is to identify specific biological themes that are enriched within the gene set of interest, thereby gaining insight into the biological phenomena under study. Biological pathway data have been particularly useful in identifying functional associations of genes and/or gene sets. However, biological pathway information as compiled in different repositories often differs in scope and content, which hinders effective and comprehensive characterisation of gene sets. Here we describe a new approach to constructing biologically coherent gene sets from pathway data in major public repositories and employing them for functional analysis of large gene sets. We first revealed significant overlaps in gene content between different pathways and then defined a clustering method based on the shared gene content and the similarity of gene overlap patterns. We established the biological relevance of the constructed pathway clusters using independent quantitative measures. Finally, we demonstrated their effectiveness in comparative functional enrichment analysis of gene sets associated with diverse human diseases gathered from the literature. The pathway clusters and gene mappings have been integrated into the TargetMine data warehouse and are likely to provide a concise, manageable and biologically relevant means of functional analysis of gene sets and to facilitate candidate gene prioritisation.
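
The two ideas in the abstract, grouping pathways by shared gene content and testing a gene set for enrichment against the resulting groups, can be illustrated with a minimal sketch. It is not the authors' method: the overlap coefficient, the 0.5 threshold, the single-linkage grouping and the hypergeometric test are all assumptions chosen for simplicity, and the pathway names and gene identifiers are made-up toy data. The abstract also mentions similarity of gene overlap patterns, which this sketch does not model.

```python
from itertools import combinations
from math import comb


def overlap_coefficient(a, b):
    """Shared genes divided by the size of the smaller gene set."""
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


def cluster_pathways(pathways, threshold=0.5):
    """Group pathways linked by pairwise overlap >= threshold.

    Single-linkage grouping via union-find (an assumed, simplified
    stand-in for the clustering described in the abstract).
    """
    parent = {name: name for name in pathways}

    def find(x):
        # Follow parent links to the root, compressing the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for p, q in combinations(pathways, 2):
        if overlap_coefficient(pathways[p], pathways[q]) >= threshold:
            parent[find(p)] = find(q)

    clusters = {}
    for name in pathways:
        clusters.setdefault(find(name), set()).add(name)
    return sorted(sorted(members) for members in clusters.values())


def enrichment_p(query, gene_set, background_size):
    """One-sided hypergeometric p-value P(X >= k).

    k is the number of query genes found in gene_set; both sets are
    assumed to be drawn from a background of background_size genes.
    """
    k = len(query & gene_set)
    n, K, N = len(query), len(gene_set), background_size
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return tail / comb(N, n)


# Hypothetical toy data: three pathways over a background of 10 genes.
pathways = {
    "A": {"g1", "g2", "g3", "g4"},
    "B": {"g2", "g3", "g4", "g5"},
    "C": {"g8", "g9"},
}
clusters = cluster_pathways(pathways)          # [['A', 'B'], ['C']]
cluster_genes = pathways["A"] | pathways["B"]  # genes of the merged cluster
p_value = enrichment_p({"g2", "g3", "g9"}, cluster_genes, background_size=10)  # 0.5
```

In the toy data, pathways A and B share three of four genes and are merged, so a query gene set is tested once against their union rather than twice against two largely redundant pathways. That reduction in redundancy is the motivation the abstract gives for working with pathway clusters.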

Publication types

  • Research Support, Non-U.S. Gov't
  • Validation Study

MeSH terms

  • Biomedical Research*
  • Cluster Analysis

Grants and funding

This work was supported in part by the Industrial Technology Research Grant Program in 2007 (Grant Number 07C46056a) from the New Energy and Industrial Technology Development Organization (NEDO) of Japan, and also by Grants-in-Aid for Scientific Research from the Ministry of Education, Culture, Sports, Science, and Technology (Grant Numbers 25430186, 25293079) and from the Ministry of Health, Labor, and Welfare (“The Adjuvant database project”) to K.M. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.