Dissertation

【Abstract】

Over the past decade, networks have become an increasingly popular abstraction for problems in the physical, life, social and information sciences. Network analysis can be used to extract insights into an underlying system from the structure of its network representation. One challenge in applying network analysis is that networks do not always have a fully observed, complete structure. This dissertation focuses on the problem of imputation and/or inference in the presence of incomplete network structures. I propose four novel systems, each of which contains a module that infers or imputes an incomplete network in order to complete the end task. I first propose EdgeBoost, a meta-algorithm and framework that repeatedly applies a non-deterministic link predictor to improve the efficacy of community detection algorithms on networks with missing edges. On average, EdgeBoost improves the performance of existing algorithms by 7% on artificial data and 17% on ego networks collected from Facebook. The second system, Butterworth, identifies a social network user's topic(s) of interest and automatically generates a set of social feed "rankers" that enable the user to see topic-specific sub-feeds. Butterworth uses link prediction to infer the missing semantics between members of a user's social network in order to detect topical clusters embedded in the network structure. For automatically generated topic lists, Butterworth achieves an average top-10 precision of 78%, compared to 45% for a time-ordered baseline. Next, I propose Dobby, a system for constructing a knowledge graph of user-defined keyword tags. Leveraging a sparse set of labeled edges, Dobby trains a supervised learning algorithm to infer the hypernym relationships between keyword tags. Dobby was evaluated by constructing a knowledge graph of LinkedIn's skills dataset, achieving an average precision of 85% on a set of human-labeled hypernym edges between skills.
Lastly, I propose Lobbyback, a system that automatically identifies clusters of documents that exhibit text reuse and generates "prototypes" that represent a canonical version of the text shared between the documents. Lobbyback infers a network structure in a corpus of documents and uses community detection to extract the document clusters.

【Preview】

Attachment list
Files	Size	Format	View
Network Analysis on Incomplete Structures.	4813KB	PDF	download


Network Analysis on Incomplete Structures.
network analysis;text mining;link prediction;community detection;missing links;Computer Science;Engineering;Computer Science and Engineering
Burgess, Matthew; Mei, Qiaozhu
University of Michigan
Keywords: network analysis; text mining; link prediction; community detection; missing links; Computer Science; Engineering; Computer Science and Engineering
Others : https://deepblue.lib.umich.edu/bitstream/handle/2027.42/133443/mattburg_1.pdf?sequence=1&isAllowed=y
Switzerland | English
Source: The Illinois Digital Environment for Access to Learning and Scholarship
PDF


	Document metrics
	Downloads: 36	Views: 31
