Journal article details
Computer Science and Information Systems
Learning syntactic tagging of Macedonian language
Bonchanoski, Martin
Keywords: part-of-speech tagging; TnT tagger; cyclic dependency network; guided learning for bidirectional sequence classification; dynamic features induction
DOI: 10.2298/CSIS180310027B
Subject classification: Social sciences, humanities and arts (general)
Source: Computer Science and Information Systems
【 Abstract 】

This paper presents the creation of machine-learning-based systems for part-of-speech (PoS) tagging of the Macedonian language. Four well-known PoS taggers previously implemented for English and Slavic languages were trained: TnT, a cyclic dependency network, a guided learning framework for bidirectional sequence classification, and dynamic feature induction. Orwell’s novel “1984” was manually tagged by the authors and split into a training set and a test set. After training, the models were compared. The resulting PoS tagger reaches an accuracy of 97.5%, making it well suited to the future grammatical tagging of the National corpus of the Macedonian language, which is currently in its initial stage. The PoS tagger that was created is published online and is free to use.
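The evaluation protocol described in the abstract (split a manually tagged corpus, train a tagger, measure accuracy on the held-out part) can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the tagger is a simple most-frequent-tag baseline rather than any of the four systems named above, the function names are hypothetical, and the toy sentences and tag labels are invented English examples standing in for the tagged text of “1984”.

```python
from collections import Counter, defaultdict


def split(tagged_sentences, train_fraction=0.9):
    """Split a list of tagged sentences into training and test parts."""
    cut = int(len(tagged_sentences) * train_fraction)
    return tagged_sentences[:cut], tagged_sentences[cut:]


def train_baseline(tagged_sentences):
    """Learn the most frequent tag per word, plus a fallback tag.

    Assumption: a most-frequent-tag baseline, NOT one of the taggers
    evaluated in the paper.
    """
    per_word = defaultdict(Counter)
    all_tags = Counter()
    for sentence in tagged_sentences:
        for word, tag in sentence:
            per_word[word][tag] += 1
            all_tags[tag] += 1
    default_tag = all_tags.most_common(1)[0][0]
    lexicon = {w: c.most_common(1)[0][0] for w, c in per_word.items()}
    return lexicon, default_tag


def tag(words, lexicon, default_tag):
    """Tag each word; unknown words receive the fallback tag."""
    return [lexicon.get(w, default_tag) for w in words]


def accuracy(tagged_sentences, lexicon, default_tag):
    """Token-level accuracy: correctly tagged tokens / all tokens."""
    correct = total = 0
    for sentence in tagged_sentences:
        words = [w for w, _ in sentence]
        gold = [t for _, t in sentence]
        predicted = tag(words, lexicon, default_tag)
        correct += sum(p == g for p, g in zip(predicted, gold))
        total += len(gold)
    return correct / total if total else 0.0


# Invented toy corpus (hypothetical tags), standing in for real data.
toy_corpus = [
    [("the", "DET"), ("dog", "NOUN"), ("runs", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sees", "VERB"), ("birds", "NOUN")],
    [("a", "DET"), ("dog", "NOUN"), ("sleeps", "VERB")],
    [("the", "DET"), ("fish", "NOUN"), ("sleeps", "VERB"), ("quietly", "ADV")],
]

train_set, test_set = split(toy_corpus, train_fraction=0.75)
lexicon, default_tag = train_baseline(train_set)
print(accuracy(test_set, lexicon, default_tag))  # 0.75 on this toy data
```

Token-level accuracy of this kind is the usual way a figure such as the 97.5% above is reported; whether the paper computes it per token or in some other way is not stated in the abstract.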

【 License 】

CC BY-NC-ND   
