Thesis

【Abstract】

With massive datasets accumulating in text repositories (e.g., news articles and customer reviews), it is highly desirable to systematically utilize and explore them with data mining, NLP and database techniques. In our view, documents in text corpora contain informative explicit meta-attributes (e.g., category, date, author) and implicit attributes (e.g., sentiment), forming one or more highly structured multi-dimensional spaces. Much knowledge can be derived if we develop effective and efficient multi-dimensional summarization, exploration and analysis technologies. In this demo, we propose TextDive, an end-to-end, real-time analytical platform that processes massive text data and provides valuable insights to general data consumers. First, we develop a set of information extraction, entity typing and text mining methods to extract consolidated dimensions and automatically construct multi-dimensional textual spaces (i.e., text cubes). Furthermore, we develop a set of OLAP-like text summarization, data exploration and text analysis mechanisms that capture the semantics of text corpora in multi-dimensional spaces. We also develop an efficient computational solution that materializes selected statistics to guarantee the interactive, real-time nature of TextDive.

【Preview】

Attachments
Files	Size	Format	View
TextDive: construction, summarization and exploration of multi-dimensional text corpora	723KB	PDF	download


TextDive: construction, summarization and exploration of multi-dimensional text corpora
Wang, Qi; Han, Jiawei
Keywords: multi-dimensional text corpora analysis; text cube analysis; text summarization
Others: https://www.ideals.illinois.edu/bitstream/handle/2142/90938/WANG-THESIS-2016.pdf?sequence=1&isAllowed=y
United States | English
Source: The Illinois Digital Environment for Access to Learning and Scholarship


	Document metrics
	Downloads: 3	Views: 3
