Journal Article Details
Algorithms
Guided Semi-Supervised Non-Negative Matrix Factorization
Deanna Needell [1], Christine Tseng [1], Benjamin Jarman [1], Pengyu Li [1], Longxiu Huang [1], Yaxuan Zheng [1], Joyce A. Chew [1]
[1] Department of Mathematics, University of California, Los Angeles, CA 90095, USA
Keywords: matrix decomposition; topic modeling; classification; semi-supervised learning; legal documents; California Innocence Project
DOI: 10.3390/a15050136
Source: DOAJ
[Abstract]

Classification and topic modeling are popular techniques in machine learning that extract information from large-scale datasets. Methods that incorporate a priori information, such as labels or important features, have been developed to perform classification and topic modeling tasks; however, most methods that can perform both do not allow for guidance of the topics or features. In this paper, we propose a novel method, namely Guided Semi-Supervised Non-negative Matrix Factorization (GSSNMF), that performs both classification and topic modeling by incorporating supervision from both pre-assigned document class labels and user-designed seed words. We test the performance of this method on legal documents provided by the California Innocence Project and on the 20 Newsgroups dataset. Our results show that the proposed method improves both classification accuracy and topic coherence in comparison to past methods such as Semi-Supervised Non-negative Matrix Factorization (SSNMF), Guided Non-negative Matrix Factorization (Guided NMF), and Topic Supervised NMF.
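
To make the idea of combining label supervision with seed-word guidance concrete, the following is a minimal sketch, not the paper's algorithm. It assumes an objective of the form ||X - WH||^2 + lam * ||L * (Y - CH)||^2 + mu * ||S - WB||^2, where X is a word-by-document matrix, Y holds one-hot class labels, L is a binary mask marking the labeled documents, and S is a word-by-seed-topic indicator matrix. The abstract does not state the objective, so this form, the Lee-Seung-style multiplicative updates, and every name here (`gssnmf_sketch`, `objective`, `lam`, `mu`, the toy data) are illustrative assumptions.

```python
import numpy as np


def objective(X, Y, L, S, W, H, C, B, lam, mu):
    """Assumed objective: reconstruction + masked label fit + seed-word fit."""
    recon = np.linalg.norm(X - W @ H) ** 2
    label = np.linalg.norm(L * (Y - C @ H)) ** 2
    guide = np.linalg.norm(S - W @ B) ** 2
    return recon + lam * label + mu * guide


def gssnmf_sketch(X, Y, L, S, k, lam=1.0, mu=1.0, n_iter=200, seed=0):
    """Multiplicative updates for the assumed objective (hypothetical helper).

    X: (n_words, n_docs) nonnegative data matrix
    Y: (n_classes, n_docs) one-hot labels
    L: (n_classes, n_docs) mask, 1 where the document's label is known
    S: (n_words, n_seeds) seed-word indicator matrix
    """
    rng = np.random.default_rng(seed)
    eps = 1e-10  # guards against division by zero
    n_words, n_docs = X.shape
    W = rng.random((n_words, k))      # words x topics
    H = rng.random((k, n_docs))       # topics x documents
    C = rng.random((Y.shape[0], k))   # classes x topics
    B = rng.random((k, S.shape[1]))   # topics x seed topics
    LY = L * Y                        # labels of supervised documents only

    history = [objective(X, Y, L, S, W, H, C, B, lam, mu)]
    for _ in range(n_iter):
        W *= (X @ H.T + mu * S @ B.T) / (
            W @ (H @ H.T) + mu * W @ (B @ B.T) + eps
        )
        H *= (W.T @ X + lam * C.T @ LY) / (
            W.T @ W @ H + lam * C.T @ (L * (C @ H)) + eps
        )
        C *= (LY @ H.T) / ((L * (C @ H)) @ H.T + eps)
        B *= (W.T @ S) / (W.T @ W @ B + eps)
        history.append(objective(X, Y, L, S, W, H, C, B, lam, mu))
    return W, H, C, B, history


# Toy data (made up for illustration): 12 words, 8 documents, 2 classes.
rng = np.random.default_rng(42)
X_toy = rng.random((12, 8))
labels = np.array([0, 1, 0, 1, 0, 1, 0, 1])
Y_toy = np.eye(2)[labels].T           # one-hot labels, shape (2, 8)
L_toy = np.zeros((2, 8))
L_toy[:, :5] = 1.0                    # only the first 5 documents are labeled
S_toy = np.zeros((12, 2))
S_toy[[0, 1], 0] = 1.0                # seed words for the first guided topic
S_toy[[6, 7], 1] = 1.0                # seed words for the second guided topic

W, H, C, B, history = gssnmf_sketch(X_toy, Y_toy, L_toy, S_toy, k=3)
predicted = np.argmax(C @ H, axis=0)  # class guess for every document
```

In this sketch, the columns of `W` play the role of topics, `C @ H` scores each document against each class (including the unlabeled ones masked out by `L`), and `lam` and `mu` trade off label fit and seed-word fit against reconstruction quality.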

[License]

Unknown   
