1st Workshop on Social Data on the Web
Wikipedia Mining for Triple Extraction Enhanced by Co-reference Resolution
Computer Science; Library, Information and Archival Science
Kotaro Nakayama
Others: http://CEUR-WS.org/Vol-405/paper6.pdf PID: 25363
Source: CEUR
【 Abstract 】
Since Wikipedia has become a huge-scale database storing a wide range of human knowledge, it is a promising corpus for knowledge extraction. A considerable number of studies on Wikipedia mining have been conducted, and they have confirmed that Wikipedia is an invaluable corpus. Wikipedia's impressive characteristics are not limited to its scale, but also include its dense link structure, URIs for word sense disambiguation, well-structured Infoboxes, and the category tree. In previous research in this area, the category tree has been widely used to extract semantic relations among concepts on Wikipedia. In this paper, we try to extract triples (Subject, Predicate, Object) from Wikipedia articles, another promising resource for knowledge extraction. We propose a practical method which integrates link structure mining and parsing to enhance the extraction accuracy. The proposed method consists of two technical novelties: two parsing strategies and a co-reference resolution method.
【 Preview 】
Files | Size | Format | View |
---|---|---|---|
Wikipedia Mining for Triple Extraction Enhanced by Co-reference Resolution | 299KB | download |