Journal article details
Mathématiques et sciences humaines. Mathematics and social sciences
Représentations du texte pour la classification arborée et l’analyse automatique de corpus. Application à un corpus d’historiens latins
Barthelemy, Jean-Pierre; Luong, Nguyen Xuan; Mellet, Sylvie; Longrée, Dominique
Keywords: generic classification; linear textual structures; motif (pattern); texts topological approach; tree analysis; lattice; neighbourhood
DOI: 10.4000/msh.11152
Subject classification: Mathematics (general)
Source: College de France * Ecole des Hautes Etudes en Sciences Sociales (EHESS)
【 Abstract 】
In this paper, we present several methods of automatic classification applied to a corpus of literary texts and compare their results; in particular, we evaluate how well each method brings out the generic classification of the corpus. We show that a topological approach to the texts that takes into account their linearity, i.e. the order of their micro- and macro-structures, yields better clustering than traditional quantitative methods, which generally disregard this linear structure.
【 License 】
Unknown