Network data has emerged as one of the most common forms of collected information, because studies increasingly focus not only on individual subjects but also on the relationships among them. In this thesis, we address two major challenges in network analysis.

In the first part of the thesis, we focus on detecting community structure in a network. In practice, members of the same community are more likely to be connected than members of different communities, which is also reflected in the correlation among edges within a community. However, existing probabilistic models for community detection, such as the stochastic block model (SBM), are not designed to capture dependence among edges. We propose a novel community detection approach that incorporates intra-community dependence among connectivities through the Bahadur representation. The proposed method does not require specifying the likelihood function, which can be intractable for correlated binary connectivities. In addition, the proposed method allows for heterogeneity among edges across different communities. In theory, we show that incorporating correlation information achieves a faster convergence rate than the independent SBM, and that the proposed algorithm has lower estimation bias and faster convergence than the variational EM. Our simulation studies show that the proposed algorithm outperforms the existing variational EM algorithm, which assumes conditional independence among edges. We also demonstrate the application of the proposed method to agricultural product trading networks from different countries.

In the second part, we focus on the joint prediction of pairwise links and hyperlinks in multi-layer networks in order to incorporate high-order relations, which are not considered by traditional graph representation models that predict only two-way pairwise relations. We propose a novel joint network embedding approach that simultaneously encodes pairwise links and hyperlinks onto a latent space to capture the dependency between pairwise and multi-way links, which allows inference of potential unobserved hyperlinks. The major advantage of the proposed embedding procedure is that it incorporates both the pairwise relationships and the subgroup-wise structure among nodes to utilize high-order network information. In addition, the proposed method introduces hierarchical dependency among links to infer potential hyperlinks, which leads to better link prediction. In theory, we establish estimation consistency for the proposed embedding approach and obtain a faster convergence rate than hyperlink prediction using pairwise links only. Numerical studies on both simulation settings and a Facebook ego-network show that the proposed method improves both hyperlink and pairwise link prediction accuracy compared to existing link prediction methods.
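
For readers unfamiliar with the baseline discussed in the first part, the following is a minimal sketch of a standard stochastic block model in which edges are drawn independently given the community labels. It only illustrates the conditional-independence assumption that the thesis relaxes; it does not implement the proposed Bahadur-based method. The function names, community sizes, and connection probabilities are illustrative assumptions and do not come from the thesis.

```python
import random
from itertools import combinations


def simulate_sbm(sizes, p_within, p_between, seed=0):
    """Draw an undirected graph from a two-parameter stochastic block model.

    Each edge is an independent Bernoulli draw given the community labels,
    i.e. the conditional-independence assumption of the standard SBM.
    All parameter names and values here are hypothetical.
    """
    rng = random.Random(seed)
    # Node k-th block gets label k; e.g. sizes=[2, 1] -> labels [0, 0, 1].
    labels = [k for k, size in enumerate(sizes) for _ in range(size)]
    edges = set()
    for i, j in combinations(range(len(labels)), 2):
        p = p_within if labels[i] == labels[j] else p_between
        if rng.random() < p:
            edges.add((i, j))  # stored with i < j
    return labels, edges


def edge_densities(labels, edges):
    """Return (within-community density, between-community density)."""
    # same_community flag -> [number of edges, number of node pairs]
    counts = {True: [0, 0], False: [0, 0]}
    for i, j in combinations(range(len(labels)), 2):
        same = labels[i] == labels[j]
        counts[same][1] += 1
        counts[same][0] += (i, j) in edges
    return (counts[True][0] / counts[True][1],
            counts[False][0] / counts[False][1])


labels, edges = simulate_sbm(sizes=[30, 30], p_within=0.5, p_between=0.05)
within, between = edge_densities(labels, edges)
print(f"within-community density:  {within:.3f}")
print(f"between-community density: {between:.3f}")
```

With these illustrative settings the within-community density should come out well above the between-community density, which is the pattern community detection exploits. In this sketch every edge is sampled on its own, so any correlation among edges inside a community is absent by construction.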
Approximate likelihood for dependent networks and hyperlink predictions