2018 2nd International Conference on Artificial Intelligence Applications and Technologies
Long Tail Constraint on Non-negative Matrix Factorization
Computer Science
Jia, Quanye^1 ; Liu, Rui^1 ; Zhang, He^1 ; You, Lu^1
School of Computer Science and Engineering, Beihang University, Beijing, China^1
Keywords: Document matrices; Feature matrices; Matrix factorizations; Mutual informations; Nonnegative matrix factorization; Orthogonal constraints; Text classification; Topic distributions
Others: https://iopscience.iop.org/article/10.1088/1757-899X/435/1/012055/pdf
DOI: 10.1088/1757-899X/435/1/012055
Subject classification: Computer Science (General)
Source: IOP
【 Abstract 】
Topic distributions in text generally exhibit a long-tail effect, yet little research has addressed how to extract long-tail topics through matrix factorization. We therefore propose Non-negative Matrix Factorization with Long-tail Constraint (LTNMF). Building on non-negative matrix factorization, LTNMF adds soft orthogonal constraints to the feature matrix to keep the topics independent. Sparse constraints and long-tail constraints are added to the topic-document matrix to make the model more robust and to better characterize the long-tail features of the topic distribution. Combining the soft orthogonal, sparse, and long-tail constraints enables the model to extract long-tail topic information from the data while preserving topic quality. Experiments on the Sogou and 20 Newsgroups datasets show that LTNMF extracts more topic words and improves clustering accuracy and normalized mutual information (NMI) in text classification.
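The abstract does not state the LTNMF objective, so the following is only a minimal sketch of the two ingredients it does name in general terms: a soft orthogonality penalty on the feature matrix and a sparsity penalty on the topic-document matrix. Everything specific here is an assumption, not the authors' formulation: the term-document input `X`, the penalty forms `(alpha/4)·||WᵀW − I||²` and `beta·||H||₁`, the multiplicative update rules, and the names `nmf_soft_orth_sparse`, `objective`, `alpha`, and `beta`. The long-tail constraint is deliberately left out because its form is not given in this record.

```python
import numpy as np


def objective(X, W, H, alpha, beta):
    """Assumed objective: reconstruction + soft orthogonality + L1 sparsity."""
    k = W.shape[1]
    recon = 0.5 * np.linalg.norm(X - W @ H, "fro") ** 2
    orth = 0.25 * alpha * np.linalg.norm(W.T @ W - np.eye(k), "fro") ** 2
    sparse = beta * H.sum()  # H >= 0, so its sum equals its L1 norm
    return recon + orth + sparse


def nmf_soft_orth_sparse(X, k, alpha=0.1, beta=0.01, n_iter=200, seed=0, eps=1e-9):
    """Heuristic multiplicative updates for the assumed objective above.

    X : (terms x documents) non-negative matrix
    W : (terms x k) feature matrix, softly pushed toward orthogonal columns
    H : (k x documents) topic-document matrix, softly pushed toward sparsity
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, k)) + eps
    H = rng.random((k, n)) + eps
    for _ in range(n_iter):
        # Gradient w.r.t. H is (W^T W H + beta) - W^T X; the ratio of the
        # negative part to the positive part keeps H non-negative.
        H *= (W.T @ X) / (W.T @ W @ H + beta + eps)
        # Gradient w.r.t. W is (W H H^T + alpha W W^T W) - (X H^T + alpha W).
        W *= (X @ H.T + alpha * W) / (W @ (H @ H.T) + alpha * W @ (W.T @ W) + eps)
    return W, H


# Usage on a small synthetic non-negative matrix (illustrative data only).
demo_rng = np.random.default_rng(1)
X_demo = demo_rng.random((30, 4)) @ demo_rng.random((4, 20))
W_demo, H_demo = nmf_soft_orth_sparse(X_demo, k=4)
```

Multiplicative updates are used here only because they preserve non-negativity without a projection step; they are a common heuristic for penalized NMF, and the paper may optimize its objective differently.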