Computer Science and Information Systems
Ontology-based multi-label classification of economic articles
Sergeja Vogrinčič [1], Zoran Bosnić [2]
[1] Jožef Stefan International Postgraduate School; [2] University of Ljubljana, Faculty of Computer and Information Science
Keywords: ontology; multi-label classification; machine learning; text categorization; economics; document classification
DOI: 10.2298/CSIS100420034V
Subject classification: Social sciences, humanities and arts (general)
Source: Computer Science and Information Systems
[Abstract]
This paper presents an approach to automatic document categorization in the field of economics. Because a document can be annotated with multiple keywords (labels), we treat the task as supervised multi-label classification and evaluate several such methods. We describe the construction of a test corpus of 1015 economic documents, which we classify automatically using a tool that integrates ontology construction with text mining methods. In our experiments, we evaluate three groups of multi-label classification approaches: transformation to single-class problems, specialized multi-label models, and hierarchical/ranking models. The classification accuracies of all tested models indicate that each of the evaluated methods has potential for solving this task. The results show the advantage of the more complex groups of approaches, which exploit dependence between the labels. Single-class naive Bayes classifiers coupled with the binary relevance transformation are also a good alternative to these approaches.
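The binary relevance transformation mentioned in the abstract can be illustrated with a minimal sketch: one independent yes/no naive Bayes classifier is trained per label, and the predicted label set is the union of the labels whose classifiers answer "yes". This is not the authors' implementation or tooling. The class names, the Laplace smoothing, and the five-document toy corpus with its two labels are all assumptions made up for illustration.

```python
import math
from collections import Counter


class BinaryNaiveBayes:
    """Multinomial naive Bayes for a single yes/no label (Laplace smoothing)."""

    def fit(self, docs, targets):
        self.vocab = {w for d in docs for w in d}
        self.word_counts = {True: Counter(), False: Counter()}
        self.doc_counts = Counter(targets)
        for words, t in zip(docs, targets):
            self.word_counts[t].update(words)
        return self

    def _log_score(self, words, t):
        n_docs = sum(self.doc_counts.values())
        # Smoothed class prior, so a class with no training documents still scores.
        score = math.log((self.doc_counts[t] + 1) / (n_docs + 2))
        total = sum(self.word_counts[t].values()) + len(self.vocab)
        for w in words:
            if w in self.vocab:  # words unseen in training are ignored
                score += math.log((self.word_counts[t][w] + 1) / total)
        return score

    def predict(self, words):
        return self._log_score(words, True) > self._log_score(words, False)


class BinaryRelevance:
    """Binary relevance: one independent binary classifier per label."""

    def fit(self, docs, label_sets):
        self.labels = sorted(set().union(*label_sets))
        self.models = {
            lab: BinaryNaiveBayes().fit(docs, [lab in s for s in label_sets])
            for lab in self.labels
        }
        return self

    def predict(self, words):
        return {lab for lab, m in self.models.items() if m.predict(words)}


# Hypothetical toy corpus: tokenized documents and their label sets.
docs = [
    ["inflation", "interest", "rates"],
    ["tax", "budget", "deficit"],
    ["inflation", "tax", "budget"],
    ["interest", "rates", "bank"],
    ["budget", "spending", "tax"],
]
label_sets = [
    {"monetary policy"},
    {"fiscal policy"},
    {"monetary policy", "fiscal policy"},
    {"monetary policy"},
    {"fiscal policy"},
]

model = BinaryRelevance().fit(docs, label_sets)
print(model.predict(["interest", "rates"]))
print(model.predict(["inflation", "interest", "tax", "budget"]))
```

Because each label is learned in isolation, this transformation ignores any dependence between labels, which is the property the more complex groups of approaches in the abstract are said to exploit.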
[License]
CC BY-NC-ND