Thesis details
Mining latent entity structures from massive unstructured and interconnected data
Wang, Chi
Keywords: data mining; text mining; information network; social network; network analysis; probabilistic graphical model; topic model; phrase mining; relation mining; Information Extraction
Others  :  https://www.ideals.illinois.edu/bitstream/handle/2142/72967/Chi_Wang.pdf?sequence=1&isAllowed=y
United States | English
Source: The Illinois Digital Environment for Access to Learning and Scholarship
【 Abstract 】

The “big data” era is characterized by an explosion of information in the form of digital data collections, ranging from scientific knowledge, to social media, news, and everyone’s daily life. Valuable knowledge about multi-typed entities is often hidden in the unstructured or loosely structured but interconnected data. Mining latent structured information around entities uncovers semantic structures from massive unstructured data and hence enables many high-impact applications, including taxonomy or knowledge base construction, multi-dimensional data analysis and information or social network analysis.

A mining framework is proposed, to solve and integrate a chain of tasks: hierarchical topic discovery, topical phrase mining, entity role analysis and entity relation mining. It reveals two main forms of structures: topical and relational structures. The topical structure summarizes the topics associated with entities with various granularity, such as the research areas in computer science. The framework enables recursive construction of phrase-represented and entity-enriched topic hierarchy from text-attached information networks. It makes breakthrough in terms of quality and computational efficiency. The relational structure recovers the hidden relationship among entities, such as advisor-advisee. A probabilistic graphical modeling approach is proposed. The method can utilize heterogeneous attributes and links to capture all kinds of semantic signals, including constraints and dependencies, to recover the hierarchical relationship with the best known accuracy.

【 Preview 】
Attachments
Files | Size | Format
Mining latent entity structures from massive unstructured and interconnected data | 2891 KB | PDF