15th International Workshop on Advanced Computing and Analysis Techniques in Physics Research
Analysis of performance improvements for host and GPU interface of the APENet+ 3D Torus network
Physics; Computer Science
Ammendola, R.^1 ; Biagioni, A.^2 ; Frezza, O.^2 ; Lo Cicero, F.^2 ; Lonardo, A.^2 ; Paolucci, P.S.^2 ; Rossetti, D.^2 ; Simula, F.^2 ; Tosoratto, L.^2 ; Vicini, P.^2
INFN Roma II, Via della Ricerca Scientifica 1, 00133 Rome, Italy^1
INFN Roma I, P.le Aldo Moro 2, 00185 Rome, Italy^2
Keywords: Analysis of performance; Data transaction; Embedded microprocessors; Hardware implementations; High performance scientific computing; Interconnect fabrics; Interconnect networks; Specialized hardware
Others: https://iopscience.iop.org/article/10.1088/1742-6596/523/1/012013/pdf
DOI: 10.1088/1742-6596/523/1/012013
Subject classification: Computer Science (General)
Source: IOP
【 Abstract 】
APEnet+ is an INFN (Italian Institute for Nuclear Physics) project aiming to develop a custom 3-dimensional torus interconnect network optimized for hybrid CPU-GPU clusters dedicated to High Performance scientific Computing. The APEnet+ interconnect fabric is built on an FPGA-based PCI Express board with 6 bi-directional off-board links, each showing 34 Gbps of raw bandwidth per direction, and leverages the peer-to-peer capabilities of Fermi and Kepler-class NVIDIA GPUs to obtain real zero-copy, low-latency GPU-to-GPU transfers. APEnet+ transfer latency is minimized by adopting an RDMA protocol implemented in the FPGA, with specialized hardware blocks tightly coupled to an embedded microprocessor. This architecture provides a high-performance, low-latency offload engine for both the transmit and receive sides of data transactions: preliminary results are encouraging, showing a 50% bandwidth increase for large packet size transfers. In this paper we describe the APEnet+ architecture, detail the hardware implementation, and discuss the impact of such specialized RDMA hardware on host interface latency and bandwidth.
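
As an illustration of why a 3D torus pairs naturally with six off-board links, the following minimal Python sketch computes the six nearest neighbours of a node (one per direction along each of the three axes) and the aggregate raw bandwidth implied by the per-link figure quoted in the abstract. The coordinate addressing, the function names, and the 4x4x4 torus size are assumptions made for illustration only; they are not taken from the paper and do not describe the actual APEnet+ routing logic.

```python
from typing import List, Tuple

Coord = Tuple[int, int, int]

# Figures quoted in the abstract: 6 links, 34 Gbps raw bandwidth per direction each.
LINKS_PER_BOARD = 6
RAW_GBPS_PER_LINK = 34


def torus_neighbors(node: Coord, dims: Coord) -> List[Coord]:
    """Return the six nearest neighbours of `node` in a 3D torus of size `dims`.

    Hypothetical addressing scheme: each node is an (x, y, z) coordinate and
    every axis wraps around, so border nodes also have six neighbours.
    """
    neighbors: List[Coord] = []
    for axis in range(3):
        for step in (+1, -1):
            coord = list(node)
            # Modular arithmetic implements the wrap-around of the torus.
            coord[axis] = (coord[axis] + step) % dims[axis]
            neighbors.append((coord[0], coord[1], coord[2]))
    return neighbors


def aggregate_raw_gbps() -> int:
    """Raw bandwidth per direction summed over all off-board links of one board."""
    return LINKS_PER_BOARD * RAW_GBPS_PER_LINK


if __name__ == "__main__":
    # Hypothetical 4x4x4 torus; the corner node still has six neighbours.
    print(torus_neighbors((0, 0, 0), (4, 4, 4)))
    print(aggregate_raw_gbps(), "Gbps raw, per direction, over all six links")
```

The aggregate figure is a simple product of the two numbers in the abstract and is a raw upper bound, not a measured throughput.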