Journal article details
BMC Bioinformatics
A robust approach to optimizing multi-source information for enhancing genomics retrieval performance
Proceedings
Jun Miao1  Qinmin Hu1  Jimmy Xiangji Huang2 
[1] Information Retrieval and Knowledge Management Research Lab, York University, M3J1P3, Toronto, ON, Canada; Department of Computer Science & Engineering, York University, M3J1P3, Toronto, ON, Canada; School of Information Technology, York University, M3J1P3, Toronto, ON, Canada
Keywords: Language Model; Query Term; Mean Average Precision; Information Retrieval System; Reciprocal Method
DOI  :  10.1186/1471-2105-12-S5-S6
Source: Springer
【 Abstract 】

Background: Users want short, specific answers to their questions, placed in context by links to the original sources in the biomedical literature. Information retrieval systems index and retrieve data using a variety of pre-defined search techniques and functions, so different sources come with different ranking strategies. In this paper, we propose a robust approach to optimizing multi-source information for improving genomics retrieval performance.

Results: In the proposed approach, we first consider a common scenario for a metasearch system that has access to multiple baselines, each retrieving and ranking documents/passages with its own model. Then, given selected baselines from multiple sources, we investigate three modified fusion methods (reciprocal, CombMNZ and CombSUM) that re-rank the candidates to produce the output for evaluation. Our empirical study on both the 2007 and 2006 genomics data sets demonstrates the viability of the proposed approach for obtaining better performance. Furthermore, the experimental results show that the reciprocal method provides notable improvements over the individual baselines, especially on the passage2-level MAP and the aspect-level MAP.

Conclusions: From the extensive experiments on two TREC genomics data sets, we draw the following conclusions. Of the three fusion methods proposed in the robust approach, the reciprocal method clearly outperforms CombMNZ and CombSUM, and CombSUM works well at the passage2 level when compared with CombMNZ. Using DFR, BM25 and the language model as the multiple sources, we observe that combining the strongest baselines (the "alliance of giants") achieves the best result. Meanwhile, within the same combination, the better a baseline performs, the more it contributes. These conclusions are useful for guiding fusion work in the field of biomedical information retrieval.
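The abstract names three fusion methods but does not define them. As a rough illustration only, the sketch below implements the commonly cited textbook forms: CombSUM sums each document's normalized scores, CombMNZ multiplies that sum by the number of baselines that retrieved the document, and the reciprocal method sums 1/(k + rank) over baselines. These are not the modified variants studied in the paper. The function names, the min-max normalization, the `k` parameter and the toy scores are all assumptions made for this example.

```python
from collections import defaultdict


def min_max_normalize(run):
    """Scale one baseline's scores to [0, 1]; `run` maps doc id -> score."""
    lo, hi = min(run.values()), max(run.values())
    if hi == lo:
        # All scores equal: treat every document as equally relevant.
        return {doc: 1.0 for doc in run}
    return {doc: (s - lo) / (hi - lo) for doc, s in run.items()}


def comb_sum(runs):
    """CombSUM: sum of a document's normalized scores across baselines."""
    fused = defaultdict(float)
    for run in runs:
        for doc, score in min_max_normalize(run).items():
            fused[doc] += score
    return dict(fused)


def comb_mnz(runs):
    """CombMNZ: CombSUM times the number of baselines that returned the doc."""
    sums = comb_sum(runs)
    hits = defaultdict(int)
    for run in runs:
        for doc in run:
            hits[doc] += 1
    return {doc: sums[doc] * hits[doc] for doc in sums}


def reciprocal(runs, k=0):
    """Reciprocal fusion: sum of 1 / (k + rank) over baselines.

    k is an assumed smoothing constant; k=0 gives plain reciprocal ranks.
    """
    fused = defaultdict(float)
    for run in runs:
        ranked = sorted(run, key=run.get, reverse=True)
        for rank, doc in enumerate(ranked, start=1):
            fused[doc] += 1.0 / (k + rank)
    return dict(fused)


def rerank(fused):
    """Return doc ids ordered by fused score, best first."""
    return sorted(fused, key=fused.get, reverse=True)


# Hypothetical scores from two baselines (e.g. BM25 and a language model).
runs = [
    {"d1": 12.0, "d2": 9.0, "d3": 3.0},
    {"d2": 0.9, "d3": 0.3, "d4": 0.1},
]
for fuse in (comb_sum, comb_mnz, reciprocal):
    print(fuse.__name__, rerank(fuse(runs)))
```

Because the reciprocal form depends only on rank positions, it sidesteps the question of how to make scores from different models comparable, whereas CombSUM and CombMNZ need a normalization step first.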

【 License 】

Unknown   
© Hu et al; licensee BioMed Central Ltd. 2011. This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
