Conference Paper Details
Workshop on Mastering the Gap, From Information Extraction to Semantic Representation
On the Design and Exploitation of Presentation Ontologies for Information Extraction
Martin Labský ; Vojtěch Svátek
Others  :  http://CEUR-WS.org/Vol-187/26.pdf
PID  :  12362
Source: CEUR
【 Abstract 】

The structure of ontologies that are considered as input to information extraction is mostly rather simple. In this paper we report on our ongoing effort of using rich ontologies with numerous constraints over the information to be extracted. Important aspects of the approach are the coupling of user-defined ontologies with other sources of knowledge such as training data and document formatting structures, and the distinction between proper domain ontologies and so-called presentation ontologies, where the latter (as 'pragmatic bridges' over the 'semantic gap') can partially be derived from the former. The extraction tool under construction builds on experience from an ongoing application in the domain of product catalogue analysis, and attempts to offer high flexibility with respect to availability of various input information sources.
