We explore connections of low-rank matrix factorizations with interesting problems in data mining and machine learning. We propose a framework for solving several low-rank matrix factorization problems, including binary matrix factorization, constrained binary matrix factorization, weightedconstrained binary matrix factorization, densest k-subgraph,and orthogonal nonnegative matrix factorization. These combinatorial problems are NP-hard. Our goal is to develop effective approximation algorithms with good theoretical properties andapply them to solve various real application problems. We reformulate each of the problems as a special clustering problem that has the sameoptimal solution as the corresponding original problem. Making use of this property, we develop clustering algorithms to solve correspondinglow-rank matrixfactorization problems. We prove that most of our clustering algorithms have constant approximation ratios, which is a highly desirable property for NP-hard problems. We apply the proposed algorithms and compare them with existing methods for real applications in pattern extraction, document clustering, transactiondata mining, recommender systems, bicluster discovery in geneexpression data, social network mining, and image representation.
【 预 览 】
附件列表
Files
Size
Format
View
Pattern extraction and clustering for high-dimensional discrete data