Conference paper

【Abstract】

This working paper reports on the early stages of our contribution to the Th(IC)2 project, in which, together with other French research teams, we aim to test and demonstrate the value of corpus analysis methods for designing domain knowledge models. The project should produce a French-language thesaurus on KE research. The main stages of the method that we apply to this experiment are (a) setting up a corpus, (b) selecting, adapting and combining relevant NLP tools, (c) interpreting and validating their results, from which terms, lexical relations or classes are extracted, and finally (d) structuring them into a semantic network. We present the LEXTER system, used to automatically extract from a corpus a list of term candidates that could later be considered as descriptors. We also comment on the validation protocol that we set up: it relies on an interface

Attachments
Files	Size	Format
The Th(IC)2 Initiative: Corpus-Based Thesaurus Construction for Indexing WWW Documents	252 KB	PDF

EKAW'2000 Workshop on Ontologies and Texts
The Th(IC)2 Initiative: Corpus-Based Thesaurus Construction for Indexing WWW Documents
Computer science; Social sciences (general)
Nathalie Aussenac-Gilles* and Didier Bourigault; Université Toulouse Le Mirail; Etudes et Recherches en Syntaxe et Sémantique (ERSS), Maison de la recherche; 5, allées Antonio Machado; 31048 TOULOUSE Cedex (F)
PID : 80289

Source: CEUR