Pathway-GPS and SIGORA: identifying relevant pathways based on the over-representation of their gene-pair signatures

PeerJ. 2013 Dec 19;1:e229. doi: 10.7717/peerj.229.


Motivation. Predominant pathway analysis approaches treat pathways as collections of individual genes and consider all pathway members equally informative. As a result, spurious and misleading pathways are at times identified as statistically significant solely because of components they share with more relevant pathways.

Results. We introduce the concept of Pathway Gene-Pair Signatures (Pathway-GPS): pairs of genes that, as a combination, are specific to a single pathway. We devised and implemented a novel approach to pathway analysis, Signature Over-representation Analysis (SIGORA), which focuses on the statistically significant enrichment of Pathway-GPS in a user-specified gene list of interest. In a comparative evaluation on several published datasets, SIGORA outperformed traditional methods by delivering biologically more plausible and relevant results.

Availability. An efficient implementation of SIGORA, as an R package with precompiled GPS data for several human and mouse pathway repositories, is available for download from
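The gene-pair signature idea can be illustrated with a small Python sketch. It is not the SIGORA R package and makes no claim to reproduce the authors' implementation. The toy pathways, the function names (`gene_pair_signatures`, `hypergeom_tail`, `score`) and the use of an unweighted hypergeometric test over signature pairs are all assumptions made for illustration.

```python
from itertools import combinations
from math import comb

# Hypothetical toy pathways; genes g2 and g3 are shared by A and B.
PATHWAYS = {
    "A": {"g1", "g2", "g3"},
    "B": {"g2", "g3", "g4"},
    "C": {"g5", "g6"},
}


def gene_pair_signatures(pathways):
    """Return, per pathway, the gene pairs that co-occur in that pathway only."""
    owners = {}
    for name, genes in pathways.items():
        for pair in combinations(sorted(genes), 2):
            owners.setdefault(pair, set()).add(name)
    signatures = {name: set() for name in pathways}
    for pair, names in owners.items():
        if len(names) == 1:  # the pair is specific to a single pathway
            signatures[next(iter(names))].add(pair)
    return signatures


def hypergeom_tail(k, K, n, N):
    """P(X >= k) when drawing n items from N, of which K are successes."""
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


def score(pathways, query):
    """Count signature pairs present in the query and test for enrichment.

    Assumption: a plain (unweighted) hypergeometric test over signature pairs.
    """
    sigs = gene_pair_signatures(pathways)
    query = set(query)
    total = sum(len(s) for s in sigs.values())  # all signature pairs
    present = {
        name: {p for p in s if p[0] in query and p[1] in query}
        for name, s in sigs.items()
    }
    drawn = sum(len(p) for p in present.values())  # signature pairs in query
    return {
        name: (
            len(present[name]),
            hypergeom_tail(len(present[name]), len(sigs[name]), drawn, total),
        )
        for name in pathways
    }


shared_only = score(PATHWAYS, {"g2", "g3"})
specific = score(PATHWAYS, {"g1", "g2", "g3"})
```

In this toy setup, a query containing only the shared genes g2 and g3 overlaps two of the three members of both A and B, yet contains no signature pair, so neither pathway is supported. Adding g1 brings in two pairs specific to A and none specific to B.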

Keywords: Functional analysis; High-throughput data; Over-representation analysis; Pathway analysis; Shared components; Systems biology.

Grant support

The project was primarily supported by Teagasc RMIS6018 and the Teagasc Walsh Fellowship scheme, with additional support from AllerGen 12B&B2 and Genome Canada. FSLB was a Michael Smith Foundation for Health Research Senior Scholar. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.