Technical Report

【Abstract】

Many data types arising from data mining applications can be modeled as bipartite graphs; examples include terms and documents in a text corpus, customers and purchasing items in market basket analysis, and reviewers and movies in a movie recommender system. In this paper, the authors propose a new data clustering method based on partitioning the underlying bipartite graph. The partition is constructed by minimizing a normalized sum of edge weights between unmatched pairs of vertices of the bipartite graph. They show that an approximate solution to the minimization problem can be obtained by computing a partial singular value decomposition (SVD) of the associated edge weight matrix of the bipartite graph. They point out the connection of their clustering algorithm to correspondence analysis used in multivariate analysis. They also briefly discuss the issue of assigning data objects to multiple clusters. In the experimental results, they apply their clustering algorithm to the problem of document clustering to illustrate its effectiveness and efficiency.
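
The abstract names the technique but not its details, so the following is only a minimal sketch of how an SVD-based bipartition of a bipartite graph might look. The degree normalization, the use of the second singular vector pair, and the split at zero are assumptions borrowed from the common spectral co-clustering formulation, not statements taken from the report. The function name `bipartition` and the toy matrix `W` are hypothetical. The sketch computes a full SVD with NumPy for brevity; for a large sparse term-document matrix, a partial SVD routine such as `scipy.sparse.linalg.svds` would be the natural substitute.

```python
import numpy as np

def bipartition(W):
    """Split the rows and columns of a nonnegative edge weight matrix
    into two co-clusters (hypothetical helper, not the report's code)."""
    W = np.asarray(W, dtype=float)
    d_rows = W.sum(axis=1)  # weighted degree of each row vertex
    d_cols = W.sum(axis=0)  # weighted degree of each column vertex
    # Assumed normalization: scale each entry by the inverse square
    # roots of the two vertex degrees. Requires all degrees > 0.
    W_hat = W / np.sqrt(np.outer(d_rows, d_cols))
    U, s, Vt = np.linalg.svd(W_hat, full_matrices=False)
    # The leading singular pair is trivial (proportional to the square
    # roots of the degrees); the second pair carries the partition.
    x = U[:, 1] / np.sqrt(d_rows)
    y = Vt[1, :] / np.sqrt(d_cols)
    # Assumed split rule: threshold the scaled vectors at zero.
    return x >= 0, y >= 0

# Toy example: rows play the role of documents, columns of terms.
# Two dense blocks are joined by two weak cross edges of weight 0.1.
W = np.array([
    [2.0, 1.0, 0.0, 0.0],
    [1.0, 2.0, 0.0, 0.1],
    [0.0, 0.0, 3.0, 1.0],
    [0.0, 0.0, 1.0, 2.0],
    [0.1, 0.0, 2.0, 2.0],
])
rows, cols = bipartition(W)
print("row cluster flags:   ", rows)
print("column cluster flags:", cols)
```

Rows and columns that receive the same flag fall in the same co-cluster. Which side is labeled `True` depends on the arbitrary sign of the singular vectors, so only the grouping is meaningful.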

【Preview】

Attachments
Files	Size	Format	View
816202.pdf	241KB	PDF	download


Bipartite graph partitioning and data clustering

Zha, Hongyuan ; He, Xiaofeng ; Ding, Chris ; Gu, Ming ; Simon, Horst D.
Lawrence Berkeley National Laboratory
Keywords: Mining; 99 General And Miscellaneous//Mathematics, Computing, And Information Science; Minimization; Multivariate Analysis; Efficiency
DOI : 10.2172/816202 RP-ID : LBNL--47970 RP-ID : AC03-76SF00098 RP-ID : 816202
United States | English
Source: UNT Digital Library