Background: Researchers have previously developed a multitude of methods designed to identify biological pathways associated with specific clinical or experimental conditions of interest, with the aim of facilitating biological interpretation of high-throughput data. Before practically applying such pathway analysis (PA) methods, we must first evaluate their performance and reliability, using datasets where the pathways perturbed by the conditions of interest have been well characterized in advance. However, such 'ground truths' (or gold standards) are often unavailable. Furthermore, previous evaluation strategies that have focused on defining 'true answers' are unable to systematically and objectively assess PA methods under a wide range of conditions.
Results: In this work, we propose a novel strategy for evaluating PA methods independently of any gold standard, either established or assumed. The strategy involves the use of two mutually complementary metrics, recall and discrimination. Recall measures the consistency of the perturbed pathways identified by applying a particular analysis method to an original large dataset and those identified by the same method to a sub-dataset of the original dataset. In contrast, discrimination measures specificity-the degree to which the perturbed pathways identified by a particular method to a dataset from one experiment differ from those identifying by the same method to a dataset from a different experiment. We used these metrics and 24 datasets to evaluate six widely used PA methods. The results highlighted the common challenge in reliably identifying significant pathways from small datasets. Importantly, we confirmed the effectiveness of our proposed dual-metric strategy by showing that previous comparative studies corroborate the performance evaluations of the six methods obtained by our strategy.
Conclusions: Unlike any previously proposed strategy for evaluating the performance of PA methods, our dual-metric strategy does not rely on any ground truth, either established or assumed, of the pathways perturbed by a specific clinical or experimental condition. As such, our strategy allows researchers to systematically and objectively evaluate pathway analysis methods by employing any number of datasets for a variety of conditions.
Keywords: Gene set enrichment analysis; Method evaluation; Pathway analysis.