Parallel Breadth-First Search on Distributed Memory Systems
Computational Research Division; Buluc, Aydin; Madduri, Kamesh
Keywords: ALGORITHMS; COMMUNICATIONS; DESIGN; DISTRIBUTION; PERFORMANCE; breadth-first search; high performance graph algorithms; large-scale data analysis; graph 500
DOI: 10.2172/1050644; RP-ID: LBNL-4769E; PID: OSTI ID: 1050644; Others: TRN: US201218%%1054
Subject classification: Mathematics (general)
United States | English
Source: SciTech Connect
【 Abstract 】
Data-intensive, graph-based computations are pervasive in several scientific applications, and are known to be quite challenging to implement on distributed memory systems. In this work, we explore the design space of parallel algorithms for Breadth-First Search (BFS), a key subroutine in several graph algorithms. We present two highly tuned parallel approaches for BFS on large parallel systems: a level-synchronous strategy that relies on a simple vertex-based partitioning of the graph, and a two-dimensional sparse matrix-partitioning-based approach that mitigates parallel communication overhead. For both approaches, we also present hybrid versions with intra-node multithreading. Our novel hybrid two-dimensional algorithm reduces communication times by up to a factor of 3.5, relative to a common vertex-based approach. Our experimental study identifies execution regimes in which these approaches will be competitive, and we demonstrate extremely high performance on leading distributed-memory parallel systems. For instance, for a 40,000-core parallel execution on Hopper, an AMD Magny-Cours based system, we achieve a BFS performance rate of 17.8 billion edge visits per second on an undirected graph of 4.3 billion vertices and 68.7 billion edges with skewed degree distribution.
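To make the level-synchronous, vertex-partitioned strategy named in the abstract concrete, the following is a minimal single-process sketch in Python. It is not the authors' implementation: the block partition, the `owner` helper, and the per-rank `outbox` lists (standing in for an all-to-all exchange between processes) are illustrative assumptions, and no real message passing or multithreading takes place.

```python
def owner(v, n, p):
    """Hypothetical helper: rank owning vertex v under a block partition
    of n vertices over p simulated ranks."""
    block = (n + p - 1) // p  # ceiling division: vertices per rank
    return v // block


def level_synchronous_bfs(adj, n, p, source):
    """Simulate level-synchronous BFS with 1D vertex partitioning.

    adj    : dict mapping vertex -> list of neighbours (undirected graph)
    n, p   : number of vertices and number of simulated ranks
    Returns a list of BFS levels, with -1 for unreachable vertices.
    """
    level = [-1] * n
    level[source] = 0
    # Each simulated rank holds the frontier vertices that it owns.
    frontier = [[] for _ in range(p)]
    frontier[owner(source, n, p)].append(source)
    depth = 0
    while any(frontier):
        # Step 1: every rank scans its local frontier and buckets each
        # neighbour by owning rank (assumed stand-in for an all-to-all).
        outbox = [[] for _ in range(p)]
        for rank in range(p):
            for u in frontier[rank]:
                for v in adj.get(u, []):
                    outbox[owner(v, n, p)].append(v)
        # Step 2: every owner marks vertices seen for the first time;
        # these form the next frontier. All ranks advance together.
        depth += 1
        frontier = [[] for _ in range(p)]
        for rank in range(p):
            for v in outbox[rank]:
                if level[v] == -1:
                    level[v] = depth
                    frontier[rank].append(v)
    return level


# Small undirected example graph; vertex 5 is isolated.
example_adj = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4], 4: [3]}
example_levels = level_synchronous_bfs(example_adj, n=6, p=3, source=0)
```

In this sketch every edge endpoint is sent to its owner at each level, which hints at why communication volume dominates on large systems and why the abstract's two-dimensional partitioning aims to reduce it.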