Workshop on Mastering the Gap, From Information Extraction to Semantic Representation
On the Design and Exploitation of Presentation Ontologies for Information Extraction
Martin Labský; Vojtěch Svátek
Others: http://CEUR-WS.org/Vol-187/26.pdf PID: 12362
Source: CEUR
【 Abstract 】
The ontologies used as input to information extraction usually have a rather simple structure. In this paper we report on our ongoing effort to use rich ontologies with numerous constraints over the information to be extracted. Important aspects of the approach are the coupling of user-defined ontologies with other sources of knowledge, such as training data and document formatting structures, and the distinction between proper domain ontologies and so-called presentation ontologies, where the latter (as 'pragmatic bridges' over the 'semantic gap') can partially be derived from the former. The extraction tool under construction builds on experience from an ongoing application in the domain of product catalogue analysis, and attempts to offer high flexibility with respect to which input information sources are available.
【 Preview 】
Files | Size | Format | View |
---|---|---|---|
On the Design and Exploitation of Presentation Ontologies for Information Extraction | 151KB | PDF | download |