Conference Paper Details
Workshop on Entity-Centric Approaches to Information and Knowledge Management on the Web | |
Profiling Topics on the Web | |
Aditya K. Sehgal ; Padmini Srinivasan | |
Others : http://CEUR-WS.org/Vol-249/submission_134.pdf PID : 12761 |
Source: CEUR | |
【 Abstract 】
The availability of large-scale data on the Web motivates the development of automatic algorithms to analyze topics and identify relationships between topics. Various approaches have been proposed in the literature. Most focus on specific entities, such as people, and not on topics in general. They are also less flexible in how they represent topics/entities. In this paper we study existing methods as well as describe preliminary research on a different approach, based on profiles, for representing general topics. Topic profiles consist of different types of features. We compare different methods for building profiles and evaluate them in terms of their information content and ability to predict relationships between topics. Our results suggest that profiles derived from the full text present in multiple pages are the most informative and that profiles derived from multiple pages are significantly better at predicting topic relationships than profiles derived from single pages.
【 Preview 】
Files | Size | Format | View |
---|---|---|---|
Profiling Topics on the Web | 193KB | download |