Wikipedia Mining for Triple Extraction Enhanced by Co-reference Resolution | |
Kotaro Nakayama | |
Others : http://CEUR-WS.org/Vol-405/paper6.pdf PID : 3441 |
Source: CEUR | |
【 Abstract 】
Since Wikipedia has become a huge-scale database storing a wide range of human knowledge, it is a promising corpus for knowledge extraction. A considerable number of studies on Wikipedia mining have been conducted, and they have confirmed that Wikipedia is an invaluable corpus. Wikipedia’s impressive characteristics are not limited to its scale, but also include the dense link structure, URIs for word sense disambiguation, well-structured Infoboxes, and the category tree. In previous research in this area, the category tree has been widely used to extract semantic relations among concepts on Wikipedia. In this paper, we try to extract triples (Subject, Predicate, Object) from Wikipedia articles, another promising resource for knowledge extraction. We propose a practical method which integrates link structure mining and parsing to enhance the extraction accuracy. The proposed method consists of two technical novelties: two parsing strategies and a co-reference resolution method.
【 Preview 】
Files | Size | Format | View |
---|---|---|---|
Wikipedia Mining for Triple Extraction Enhanced by Co-reference Resolution | 299KB | download |