Journal Article Details
IEEE Access
Accelerating Fully Homomorphic Encryption Through Architecture-Centric Analysis and Optimization
Wonkyung Jung [1], Jung Ho Ahn [1], Namhoon Kim [1], Keewoo Lee [2], Jung Hee Cheon [2], Sangpyo Kim [2], Chohong Min [3], Jongmin Kim [4], Eojin Lee [5]
[1] Department of Intelligence and Information, Seoul National University, Seoul, South Korea; [2] Department of Mathematical Sciences, Seoul National University, Seoul, South Korea; [3] Department of Mathematics, Ewha Womans University, Seoul, South Korea; [4] Interdisciplinary Program in Artificial Intelligence, Seoul National University, Seoul, South Korea; [5] Samsung Electronics, Suwon, South Korea
Keywords: Computer applications; computer architecture; cryptography; multicore processing
DOI: 10.1109/ACCESS.2021.3096189
Source: DOAJ
【 Abstract 】

Homomorphic Encryption (HE) has drawn significant attention as a privacy-preserving approach for cloud computing because it allows computation on encrypted messages called ciphertexts. Among the numerous HE schemes proposed thus far, HE for Arithmetic of Approximate Numbers (HEAAN) is rapidly gaining popularity across a wide range of applications, as it supports messages that can tolerate approximate computations with no limit on the number of arithmetic operations applicable to the ciphertexts. A critical shortcoming of HE is the high computational complexity of ciphertext arithmetic; specifically, HE multiplication (HE Mul) is more than 10,000 times slower than the corresponding multiplication between unencrypted messages. This has led to a large body of HE acceleration studies, including those that exploit FPGAs; however, a rigorous analysis of the computational complexity and data access patterns of HE Mul is lacking. Moreover, these proposals mostly focused on designs with small parameter sizes, making it difficult to accurately estimate the performance of the HE accelerators when conducting a series of complex arithmetic operations. In this paper, we first describe how HE Mul of HEAAN is performed in a manner accessible to non-crypto experts. Then, we conduct a disciplined analysis of its computational and memory-access characteristics, through which we (1) extract parallelism in the key functions composing HE Mul and (2) demonstrate how to map the parallelism effectively to popular parallel processing platforms (CPUs and GPUs) by applying a series of optimizations such as transposing matrices and pinning data to threads. This leads to performance improvements of HE Mul on a CPU and a GPU by $2.06\times$ and $4.05\times$, respectively, over the reference HEAAN running on a CPU with 24 threads.

【 License 】

Unknown   
