Thesis Details
Web Information Retrieval using Web Document Structures.
Namjoshi, Nihar; Dr. Robert StAmant, Committee Chair; Dr. Christopher Healey, Committee Member; Dr. James Lester, Committee Member
University: North Carolina State University
Keywords: Word relevance; Information Extraction; Web Mining; Information Retrieval; Word Measures
Others: https://repository.lib.ncsu.edu/bitstream/handle/1840.16/896/etd.pdf?sequence=1&isAllowed=y
United States | English
【 Abstract 】

Information domains such as the World Wide Web hold an enormous amount of content. Extracting information relevant to a particular topic, or predicting what sort of information a user is seeking, is not a trivial task. For a user, finding information relevant to a particular area of interest can be inconvenient and sometimes frustrating. Studies have shown that users faced with such a task may easily become bored and leave a Web site. Traditional Information Retrieval techniques rely on measures such as the frequency of a word in a given document or the hyperlink connectivity of that document. This approach does not necessarily bring out the important words or terms in a document and can therefore be less effective when returning search results for queries. Our approach relies not only on the actual text of the document but also on the inherent formatting elements of Web pages, derived from HyperText Markup Language (HTML) syntax, to support information extraction. We use rules to assign measures to important terms in a document, which facilitates the extraction of relevant information. We evaluated our system by asking users to test it and by comparing our results with those of a conventional search engine.
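
The idea of scoring terms by the HTML elements that enclose them can be illustrated with a minimal sketch. This is not the thesis's rule set: the tag weights in `TAG_WEIGHTS`, the "largest enclosing weight wins" rule, and the names `StructureWeightedTerms` and `score_terms` are all assumptions chosen for illustration. The sketch uses only Python's standard `html.parser` module.

```python
from collections import defaultdict
from html.parser import HTMLParser
import re

# Hypothetical weights: illustrative assumptions only,
# not the rules or measures used in the thesis.
TAG_WEIGHTS = {"title": 5.0, "h1": 4.0, "h2": 3.0,
               "b": 2.0, "strong": 2.0, "a": 1.5}
DEFAULT_WEIGHT = 1.0
SKIPPED_TAGS = {"script", "style"}  # their contents are not visible text


class StructureWeightedTerms(HTMLParser):
    """Accumulate a per-term score that depends on the enclosing HTML tags."""

    def __init__(self):
        super().__init__()
        self.open_tags = []               # stack of currently open tags
        self.scores = defaultdict(float)  # term -> accumulated score

    def handle_starttag(self, tag, attrs):
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the most recent matching open tag; ignore stray closers.
        if tag in self.open_tags:
            while self.open_tags.pop() != tag:
                pass

    def handle_data(self, data):
        if SKIPPED_TAGS.intersection(self.open_tags):
            return
        # Assumed rule: a term takes the largest weight among its enclosing tags.
        weight = max(
            (TAG_WEIGHTS.get(t, DEFAULT_WEIGHT) for t in self.open_tags),
            default=DEFAULT_WEIGHT,
        )
        for term in re.findall(r"[a-z0-9]+", data.lower()):
            self.scores[term] += weight


def score_terms(html):
    """Return a dict mapping each term in `html` to its structure-weighted score."""
    parser = StructureWeightedTerms()
    parser.feed(html)
    parser.close()
    return dict(parser.scores)
```

Calling `score_terms(page_source)` and sorting the result by score gives a ranking in which a word that appears once in a title or heading can outrank a word that appears several times in body text, which is the contrast with pure frequency-based measures that the abstract draws.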

【 Preview 】
Attachments
Files | Size | Format | View
Web Information Retrieval using Web Document Structures. | 615KB | PDF | download
  Document metrics
  Downloads: 42  Views: 46