Journal article details
Computer Science and Information Systems
Ontology-based multi-label classification of economic articles
Sergeja Vogrinčič [1], Zoran Bosnić [2]
[1] Jožef Stefan International Postgraduate School; [2] University of Ljubljana, Faculty of Computer and Information Science
Keywords: ontology; multi-label classification; machine learning; text categorization; economics; document classification
DOI: 10.2298/CSIS100420034V
Subject classification: Social sciences, humanities and arts (general)
Source: Computer Science and Information Systems
【 Abstract 】

This paper presents an approach to automatic document categorization in the field of economics. Because documents can be annotated with multiple keywords (labels), we treat the task as a multi-label classification problem and apply and evaluate supervised machine learning methods for it. We describe the construction of a test corpus of 1015 economic documents, which we classify automatically using a tool that integrates ontology construction with text mining methods. In our experiments, we evaluate three groups of multi-label classification approaches: transformation to single-class problems, specialized multi-label models, and hierarchical/ranking models. The classification accuracies of all tested models indicate that every evaluated method has potential for solving this task. The results show the benefit of the more complex groups of approaches, which exploit dependence between the labels. Single-class naive Bayes classifiers coupled with the binary relevance transformation are also a good alternative to these approaches.
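
To make the binary relevance transformation mentioned in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes a bag-of-words representation and a hand-written multinomial naive Bayes with Laplace smoothing; the class `BinaryNB`, the helper functions, the toy documents, and the label names are all hypothetical and invented for illustration. One independent two-class classifier is trained per label, and the predicted label set is the union of the labels whose classifier answers "yes".

```python
import math
from collections import Counter


class BinaryNB:
    """Two-class multinomial naive Bayes with Laplace smoothing (illustrative)."""

    def fit(self, docs, targets):
        self.vocab = {w for d in docs for w in d}
        self.word_counts = {True: Counter(), False: Counter()}
        self.doc_counts = {True: 0, False: 0}
        for tokens, y in zip(docs, targets):
            self.doc_counts[y] += 1
            self.word_counts[y].update(tokens)
        return self

    def _log_score(self, tokens, y):
        n_docs = sum(self.doc_counts.values())
        # Smoothed class prior, so a class with no training documents stays finite.
        score = math.log((self.doc_counts[y] + 1) / (n_docs + 2))
        total = sum(self.word_counts[y].values()) + len(self.vocab)
        for w in tokens:
            if w in self.vocab:  # ignore words never seen in training
                score += math.log((self.word_counts[y][w] + 1) / total)
        return score

    def predict(self, tokens):
        return self._log_score(tokens, True) > self._log_score(tokens, False)


def fit_binary_relevance(docs, label_sets):
    """Train one independent BinaryNB per label (binary relevance)."""
    labels = sorted(set().union(*label_sets))
    return {
        label: BinaryNB().fit(docs, [label in s for s in label_sets])
        for label in labels
    }


def predict_labels(models, tokens):
    """Return every label whose per-label classifier accepts the document."""
    return {label for label, model in models.items() if model.predict(tokens)}


# Toy corpus, invented for illustration; it is not taken from the paper's data.
docs = [
    "inflation interest rates central bank".split(),
    "bank interest rates monetary policy".split(),
    "unemployment wages labour market".split(),
    "labour market wages inflation".split(),
]
label_sets = [{"monetary"}, {"monetary"}, {"labour"}, {"labour", "monetary"}]

models = fit_binary_relevance(docs, label_sets)
predicted = predict_labels(models, "wages labour unemployment".split())
print(predicted)  # {'labour'}
```

Because each label is learned separately, this transformation ignores dependence between labels, which is the property the more complex approaches in the abstract are reported to exploit.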

【 License 】

CC BY-NC-ND   
