Workshop on Mastering the Gap, From Information Extraction to Semantic Representation | |
Utilize Probabilistic Topic Models to Enrich Knowledge Bases | |
Laura Dietz ; Avaré Stewart | |
Others : http://CEUR-WS.org/Vol-187/25.pdf PID : 12365 |
|
来源: CEUR | |
【 摘 要 】
In publication driven domains such as the scientific community the availability of topic information in the form of a taxonomy and associated publications is essential. State-of-the-art methods for topic extraction in the Semantic Web community either need high manual effort (e.g. when using categorization) or rely on error prone techniques such as hierarchical clustering. We present an alternative solution that uses probabilistic topic models, a technique for unsupervised topic extraction based on statistical inference. The topic model can autonomously perform tasks that require massive data processing; such as identifying topics and associations of publications to multiple topics. Only for tasks requiring intellectual activity and for which no reliable automated techniques are available, is the user is asked for assistance. In this work we explicate how the results of the topic model are stored in a knowledge base for later reuse. It is described how the stored information can be interpreted to provide diagnostic support for the manual topic refinement. We deliniate how the extracted topic information can be exploited in an community service application for the end user.
【 预 览 】
Files | Size | Format | View |
---|---|---|---|
Utilize Probabilistic Topic Models to Enrich Knowledge Bases | 325KB | download |