Web document clustering using hyperlink structures
He, Xiaofeng; Zha, Hongyuan; Ding, Chris H.Q.; Simon, Horst D.
Lawrence Berkeley National Laboratory
Keywords: Management; 99 General And Miscellaneous//Mathematics, Computing, And Information Science; Computers; Engines; Information Retrieval
DOI: 10.2172/815474 | RP-ID: LBNL--47971 | RP-ID: AC03-76SF00098 | RP-ID: 815474
United States | English
Source: UNT Digital Library
【 Abstract 】
With the exponential growth of information on the World Wide Web, there is great demand for developing efficient and effective methods for organizing and retrieving the information available. Document clustering plays an important role in information retrieval and taxonomy management for the World Wide Web and remains an interesting and challenging problem in the field of web computing. In this paper, we consider document clustering methods that explore textual information, hyperlink structure, and co-citation relations. In particular, we apply the normalized-cut clustering method developed in computer vision to the task of hyperdocument clustering. We also explore some theoretical connections between the normalized-cut method and the K-means method. We then experiment with the normalized-cut method in the context of clustering query result sets for web search engines.
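For readers unfamiliar with the normalized-cut method named in the abstract, the following is a minimal sketch of the standard two-way spectral relaxation applied to a toy link graph. It is not taken from the report: the helper names (`link_graph`, `normalized_cut_bipartition`, `ncut_value`), the six-page example, and the choice to treat each hyperlink as an undirected edge of weight 1 are all assumptions made for illustration. The report also considers textual and co-citation information, which this sketch leaves out.

```python
import numpy as np

def link_graph(n_pages, links):
    """Build a symmetric weight matrix from (source, target) hyperlinks.

    Assumption for this sketch: link direction is ignored and every
    link has weight 1.
    """
    W = np.zeros((n_pages, n_pages))
    for src, dst in links:
        W[src, dst] = 1.0
        W[dst, src] = 1.0
    return W

def normalized_cut_bipartition(W):
    """Two-way split from the spectral relaxation of the normalized cut."""
    W = np.asarray(W, dtype=float)
    d = W.sum(axis=1)  # node degrees
    if np.any(d <= 0):
        raise ValueError("every page needs at least one link")
    d_inv_sqrt = 1.0 / np.sqrt(d)
    # Symmetric normalized Laplacian: I - D^{-1/2} W D^{-1/2}
    L = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order for symmetric input
    _, eigvecs = np.linalg.eigh(L)
    # Map the second eigenvector back to the generalized problem
    # (D - W) y = lambda * D y, then split on the sign of y.
    y = d_inv_sqrt * eigvecs[:, 1]
    return y >= 0

def ncut_value(W, labels):
    """Normalized cut: cut(A, B) / vol(A) + cut(A, B) / vol(B)."""
    W = np.asarray(W, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    cut = W[labels][:, ~labels].sum()
    return cut / W[labels].sum() + cut / W[~labels].sum()

# Hypothetical example: two tightly linked groups of three pages,
# joined by a single link between page 2 and page 3.
links = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
W = link_graph(6, links)
labels = normalized_cut_bipartition(W)
print(labels, ncut_value(W, labels))
```

Which group receives `True` depends on the arbitrary sign of the eigenvector, so only the grouping itself is meaningful. Splitting at zero is one common thresholding choice; other thresholds, or a K-means step on the eigenvector entries, are also used in practice.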
【 Preview 】
Files | Size | Format | View |
---|---|---|---|
815474.pdf | 611KB | PDF | download |