Thesis details
Clustering based causal topic mining
Mohan, Vishaal; Zhai, ChengXiang
Keywords: Text mining; Topic models; Time series
Others  :  https://www.ideals.illinois.edu/bitstream/handle/2142/97626/MOHAN-THESIS-2017.pdf?sequence=1&isAllowed=y
United States | English
Source: The Illinois Digital Environment for Access to Learning and Scholarship
【 Abstract 】

Events in the world generate an enormous amount of textual data, such as tweets and news articles. These events also manifest as changes in numeric time-series data. This thesis addresses the problem of extracting these events from a timestamped document collection in the form of topics that cause a change in a time series. We develop a conceptual framework that can be used to analyze different causal topic mining algorithms. We also propose two novel clustering-based algorithms, cCTM-CF and cCTM-CoF, to generate causal topics. We evaluate these algorithms both qualitatively and quantitatively, comparing their coherence and correlation scores to those of the baseline generative causal topic model, gCTM. We found that cCTM-CoF performs 35% and 62.5% better than the baseline according to these metrics.
