学位论文详细信息
Systematic optimization of search engines for difficult queries
Information Retrieval;Statistical Translation Language Model;Negative Feedback;Exploration-Exploitation Optimization
Karimzadehgan, Maryam
关键词: Information Retrieval;    Statistical Translation Language Model;    Negative Feedback;    Exploration-Exploitation Optimization;   
Others  :  https://www.ideals.illinois.edu/bitstream/handle/2142/34551/karimzadehgan_maryam.pdf?sequence=1&isAllowed=y
美国|英语
来源: The Illinois Digital Environment for Access to Learning and Scholarship
PDF
【 摘 要 】

With the advent of Web, text information is being generated across the globe at an unfathomable rate and covering countless topics. This dramatic growth in text information and the increasing number of ways people can utilize it has influenced our daily lives in fundamental and profound ways. The widespread and multi-purposed use of search engines isone example of how transformed our lives have become in relation to text information. Although the vast majority of people's information needs can be served very successfully by current search engines, there is still a considerable number ofqueries that even the best search engines perform poorly on.In this dissertation, we propose a method of optimizing a search engine to better handle such “difficultqueries”, which addresses issues and opportunities at three different stages of an interactive search.Specifically, we propose to improve search quality for difficult queries by: (1) Bridging the vocabulary gap by defining a semantic smoothing language model. A query can be difficult because it does notcontain the optimal choice of related terms, or lacks sufficient discriminative terms. In the pre-retrieval stage, using semanticsmoothing during query formulation can mitigate the effect of those omissions. (2) Incorporating user negative feedback, i.e., improving a search engine by learning from user feedback in the post-retrieval stage. When a query is so difficult that all the top retrieved results(e.g., top 10) are completely irrelevant, the feedback that a user can provide is solely negative. We propose a generalizedoptimization framework to learn from feedback on non-relevant documents to prune extensively, but carefully, a large number of non-relevantdocuments from the top of the ranking list. (3) Balancing priorities when learning from user interaction in order to optimize the whole session. When presenting the search results to the user, there is a tradeoff between promoting those with the highest immediate utility and promoting those with the best potential for collecting feedback information, which can be used to better serve the user over the course of the session (interactive search stage).We frame this tradeoff as a problem of optimizing the diversification of search results, and we propose a machine learningapproach that adaptively optimizes each individual user query such that wemaximize the overall utility of the entire session. In summary, this dissertation is expected to advance thestate of the art of search engines by providing a suite of novel search algorithms, which use the listed approaches to improve a search engine's ability to handle difficult queries.

【 预 览 】
附件列表
Files Size Format View
Systematic optimization of search engines for difficult queries 869KB PDF download
  文献评价指标  
  下载次数:1次 浏览次数:2次