Data Science and Engineering | |
A Workload-Adaptive Streaming Partitioner for Distributed Graph Stores | |
Mengchi Liu1  Ali Davoudian2  Liu Chen3  Hongwei Tu3  | |
[1] Guangzhou Key Laboratory of Big Data and Intelligent Education, School of Computer Science, South China Normal University, Guangzhou, China;School of Computer Science, Carleton University, Ottawa, Canada;School of Computer Science, Wuhan University, Wuhan, China; | |
关键词: Graph partitioning; Streaming; Workload-adaptive; Topology-aware; Dynamic workloads; | |
DOI : 10.1007/s41019-021-00156-2 | |
来源: Springer | |
【 摘 要 】
Streaming graph partitioning methods have recently gained attention due to their ability to scale to very large graphs with limited resources. However, many such methods do not consider workload and graph characteristics. This may degrade the performance of queries by increasing inter-node communication and computational load imbalance. Moreover, existing workload-aware methods cannot consistently provide good performance as they do not consider dynamic workloads that keep emerging in graph applications. We address these issues by proposing a novel workload-adaptive streaming partitioner named WASP, that aims to achieve low-latency and high-throughput online graph queries. As each workload typically contains frequent query patterns, WASP exploits the existing workload to capture active vertices and edges which are frequently visited and traversed, respectively. This information is used to heuristically improve the quality of partitions either by avoiding the concentration of active vertices in a few partitions proportional to their visit frequencies or by reducing the probability of the cut of active edges proportional to their traversal frequencies. In order to assess the impact of WASP on a graph store and to show how easily the approach can be plugged on top of the system, we exploit it in a distributed graph-based RDF store. Our experiments over three synthetic and real-world graph datasets and the corresponding static and dynamic query workloads show that WASP achieves a better query performance against state-of-the-art graph partitioners, especially in dynamic query workloads.
【 授权许可】
CC BY
【 预 览 】
Files | Size | Format | View |
---|---|---|---|
RO202107073579609ZK.pdf | 2404KB | download |