Technical Report Details
Fine Grained Classification of Named Entities In Wikipedia
Tkachenko, Maksim ; Ulanov, Alexander ; Simanovsky, Andrey
HP Development Company
Keywords: named entity recognition; Wikipedia; classification
RP-ID: HPL-2010-166
Subject classification: Computer Science (General)
United States | English
Source: HP Labs
【 Abstract 】

This report describes a study on classifying Wikipedia articles into an extended set of named entity classes. We employed a semi-automatic method to extend Wikipedia class annotation and created a training set for 15 named entity classes. We implemented two classifiers. A binary named-entity classifier decides between articles about named entities and other articles. A support vector machine (SVM) classifier trained on a variety of Wikipedia features determines the class of a named entity. Combining the two classifiers boosted classification quality to a level better than the state of the art.
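
The two-stage arrangement described in the abstract can be illustrated with a minimal sketch. It shows only the control flow of the cascade: a binary filter first, then a class assignment for the articles that pass. Everything named here is an assumption made for illustration, including `make_cascade`, the toy stand-in classifiers, the feature keys `has_infobox` and `categories`, and the class labels `person` and `location`. The report's actual features, its trained SVM, and its 15 classes are not reproduced.

```python
from typing import Any, Callable, Dict, Optional

# Hypothetical representation: one Wikipedia article as a dict of features.
Article = Dict[str, Any]


def make_cascade(
    is_named_entity: Callable[[Article], bool],
    entity_class: Callable[[Article], str],
) -> Callable[[Article], Optional[str]]:
    """Chain a binary filter with a multi-class classifier."""

    def classify(article: Article) -> Optional[str]:
        # Stage 1: reject articles that are not about a named entity.
        if not is_named_entity(article):
            return None
        # Stage 2: assign a fine-grained class to the remaining articles.
        return entity_class(article)

    return classify


# Toy stand-ins for the trained models (assumptions, not the report's method).
def toy_is_named_entity(article: Article) -> bool:
    return bool(article.get("has_infobox"))


def toy_entity_class(article: Article) -> str:
    categories = article.get("categories", [])
    return "person" if "births" in " ".join(categories) else "location"


classify = make_cascade(toy_is_named_entity, toy_entity_class)
```

In a real setting, the two callables would wrap trained models, for example a binary classifier and a multi-class SVM; the cascade itself would stay the same.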
