IEEE Access | Volume: 7
PartNRL: Partial Nodes Representation Learning in Large-Scale Network
Dong Huang [1], Jian-Huang Lai [2], Juan-Hui Li [2], Ling Huang [2], Chang-Dong Wang [2]
[1] College of Mathematics and Informatics, South China Agricultural University, Guangzhou, China
[2] Key Laboratory of Machine Intelligence and Advanced Computing, School of Data and Computer Science, Ministry of Education, Sun Yat-sen University, Guangzhou, China
Keywords: Network embedding; large-scale network; partial nodes; representation learning
DOI: 10.1109/ACCESS.2019.2913449
Source: DOAJ
【 Abstract 】
Low-dimensional node embedding has recently received considerable attention in network analysis. While existing methods mostly focus on embedding the entire network, in some situations people may be interested in only some nodes (i.e., partial nodes) rather than all of them, especially in large-scale networks. Although some approaches handle large-scale networks, most of them require all nodes to be present during the optimization procedure. Having to generate embeddings for redundant nodes of no interest makes these methods quite inefficient. In this paper, we present a novel node representation framework termed partial nodes representation learning (PartNRL), which preserves the local similarity of the node pairs of interest and generates the embedding results efficiently. PartNRL consists of two phases. The first phase uses a local random walk to capture the t-step local similarity in a memory-efficient way. Using the inherent node properties modeled in the first phase, the second phase learns the node representations by maximizing a likelihood function based on the skip-gram model. To make the optimization more efficient, the negative sampling strategy is applied. Extensive experiments have been conducted on three tasks: node clustering, single-label classification, and multi-label classification. The experimental results confirm the superior performance of the proposed model over state-of-the-art network embedding methods.
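The two phases described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`local_walk_pairs`, `train_sgns`), the adjacency-dictionary input, the uniform negative sampler, and the plain SGD update are all assumptions made for illustration. The first function runs random walks of at most t steps from each node of interest only, and the second fits skip-gram embeddings with negative sampling, keeping input vectors just for those nodes.

```python
import numpy as np

def local_walk_pairs(adj, targets, t, walks_per_node, rng):
    """Phase 1 (sketch): walks of at most t steps, started only at target nodes."""
    pairs = []
    for src in targets:
        for _ in range(walks_per_node):
            cur = src
            for _ in range(t):
                nbrs = adj[cur]
                if not nbrs:          # dead end: stop this walk early
                    break
                cur = nbrs[rng.integers(len(nbrs))]
                pairs.append((src, cur))   # (node of interest, node seen within t steps)
    return pairs

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_sgns(pairs, targets, num_nodes, dim=16, negatives=5,
               lr=0.05, epochs=20, seed=0):
    """Phase 2 (sketch): skip-gram with negative sampling, trained by SGD."""
    rng = np.random.default_rng(seed)
    row = {node: i for i, node in enumerate(targets)}
    w_in = rng.normal(scale=0.1, size=(len(targets), dim))  # target nodes only
    w_out = np.zeros((num_nodes, dim))                      # context vectors
    for _ in range(epochs):
        for idx in rng.permutation(len(pairs)):
            src, ctx = pairs[idx]
            u = w_in[row[src]]
            # One positive context plus uniformly drawn negatives (assumption:
            # a negative may occasionally coincide with the positive).
            nodes = np.concatenate(([ctx], rng.integers(num_nodes, size=negatives)))
            labels = np.zeros(len(nodes))
            labels[0] = 1.0
            v = w_out[nodes]                      # copy of the selected rows
            g = sigmoid(v @ u) - labels           # gradient of the loss w.r.t. scores
            grad_u = g @ v
            # subtract.at accumulates correctly when a node index repeats
            np.subtract.at(w_out, nodes, lr * np.outer(g, u))
            w_in[row[src]] -= lr * grad_u
    return {node: w_in[i].copy() for node, i in row.items()}

# Toy graph: two triangles joined by the edge 2-3; only nodes 0 and 5 are of interest.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
targets = [0, 5]
pairs = local_walk_pairs(adj, targets, t=2, walks_per_node=10,
                         rng=np.random.default_rng(0))
emb = train_sgns(pairs, targets, num_nodes=len(adj), dim=8)
```

In this sketch the input-embedding table has one row per node of interest rather than one per node in the graph, which is one simple way to avoid producing embeddings for nodes nobody asked about. The context table still spans all nodes here, so a memory-conscious variant would need to restrict it to nodes actually visited.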
【 License 】
Unknown