BMC Bioinformatics
A strategy for evaluating pathway analysis methods
Methodology Article
Anders Wallqvist1, Chenggang Yu1, Jaques Reifman1, Hyung Jun Woo1, Xueping Yu1, Tatsuya Oyama1
[1] Department of Defense Biotechnology High Performance Computing Software Applications Institute, Telemedicine and Advanced Technology Research Center, U.S. Army Medical Research and Materiel Command, Fort Detrick, MD 21702, USA
Keywords: Gene set enrichment analysis; Pathway analysis; Method evaluation
DOI: 10.1186/s12859-017-1866-7
Received: 2017-06-08; Accepted: 2017-10-09; Published: 2017
Source: Springer
【 Abstract 】
Background: Researchers have developed a multitude of methods designed to identify biological pathways associated with specific clinical or experimental conditions of interest, with the aim of facilitating the biological interpretation of high-throughput data. Before applying such pathway analysis (PA) methods in practice, we must first evaluate their performance and reliability using datasets in which the pathways perturbed by the conditions of interest have been well characterized in advance. However, such 'ground truths' (or gold standards) are often unavailable. Furthermore, previous evaluation strategies that focus on defining 'true answers' cannot systematically and objectively assess PA methods under a wide range of conditions.

Results: In this work, we propose a novel strategy for evaluating PA methods independently of any gold standard, either established or assumed. The strategy involves two mutually complementary metrics, recall and discrimination. Recall measures consistency: the degree to which the perturbed pathways identified by applying a particular method to an original large dataset agree with those identified by applying the same method to a sub-dataset of the original. In contrast, discrimination measures specificity: the degree to which the perturbed pathways identified by applying a particular method to a dataset from one experiment differ from those identified by applying the same method to a dataset from a different experiment. We used these metrics and 24 datasets to evaluate six widely used PA methods. The results highlight the common challenge of reliably identifying significant pathways from small datasets. Importantly, we confirmed the effectiveness of our proposed dual-metric strategy by showing that previous comparative studies corroborate the performance evaluations of the six methods obtained with our strategy.

Conclusions: Unlike previously proposed strategies for evaluating the performance of PA methods, our dual-metric strategy does not rely on any ground truth, either established or assumed, of the pathways perturbed by a specific clinical or experimental condition. Our strategy therefore allows researchers to systematically and objectively evaluate PA methods using any number of datasets for a variety of conditions.
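The abstract does not specify how the two metrics are computed. As an illustrative sketch only, they could be operationalized as set-overlap scores over the lists of significant pathways each run produces; the use of the Jaccard index here is an assumption for illustration, not the paper's actual definition:

```python
def pathway_set_overlap(set_a, set_b):
    """Jaccard overlap between two collections of pathway identifiers.

    Note: Jaccard is an illustrative choice; the paper may define
    its metrics differently.
    """
    a, b = set(set_a), set(set_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def recall(full_dataset_pathways, subset_pathways):
    """Consistency between pathways found on the full dataset and on a
    sub-dataset of it (higher = more consistent results)."""
    return pathway_set_overlap(full_dataset_pathways, subset_pathways)

def discrimination(pathways_experiment_1, pathways_experiment_2):
    """Degree to which pathways found for one experiment differ from
    those found for an unrelated experiment (higher = more specific)."""
    return 1.0 - pathway_set_overlap(pathways_experiment_1, pathways_experiment_2)
```

Under this reading, a reliable PA method would score high on both metrics: its findings survive subsampling (recall) while remaining distinct across unrelated conditions (discrimination).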
【 License 】
CC BY
© The Author(s). 2017