IEICE Electronics Express
Sample-wise dynamic precision quantization for neural network acceleration
Article
Bowen Li [1], Dongliang Xiong [1], Kai Huang [1], Xiaowen Jiang [1], Hao Yao [2], Junjian Chen [2], Luc Claesen [3]
[1] School of Micro-Nano Electronics, Zhejiang University; [2] Digital Grid Research Institute; [3] Engineering Technology-Electronics-ICT Department, University of Hasselt
Keywords: convolutional neural networks; dynamic quantization; hardware accelerators
DOI: 10.1587/elex.19.20220229
Subject classification: Electronic, optical, and magnetic materials
Source: Denshi Jouhou Tsuushin Gakkai
【Abstract】
Quantization is a well-known method for compressing and accelerating deep neural networks (DNNs). In this work, we propose the Sample-Wise Dynamic Precision (SWDP) quantization scheme, which switches the bit-width of weights and activations in the model at runtime according to the task difficulty of each input sample. Using low-precision networks for easy input images improves computational and energy efficiency. We also propose an adaptive hardware design for the efficient implementation of SWDP networks. Experimental results on various networks and datasets demonstrate that SWDP achieves an average 3.3× speedup and 3.0× energy saving over BitFusion, a bit-level dynamically composable architecture.
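The idea of choosing precision per input sample can be illustrated with a small sketch. This is not the SWDP method from the paper: the abstract does not say how task difficulty is measured, so the sketch assumes a softmax-confidence threshold as a stand-in, uses a toy linear classifier instead of a convolutional network, and applies plain uniform symmetric quantization. The names `quantize`, `predict_dynamic`, `bit_widths`, and `threshold` are hypothetical.

```python
import math

def quantize(values, bits):
    """Uniform symmetric fake-quantization of a list of floats (assumed scheme)."""
    if bits < 2:
        raise ValueError("bits must be at least 2 for a signed symmetric grid")
    levels = 2 ** (bits - 1) - 1              # e.g. 7 positive levels for 4 bits
    max_abs = max(abs(v) for v in values) or 1.0  # avoid a zero scale
    scale = max_abs / levels
    return [round(v / scale) * scale for v in values]

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def linear(x, weights):
    """Toy model: one fully connected layer, one weight row per class."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def predict_dynamic(x, weights, bit_widths=(2, 4, 8), threshold=0.8):
    """Try the lowest bit-width first; escalate while confidence is low.

    The confidence threshold is an assumed difficulty proxy, not the
    criterion used in the paper. Returns (predicted class, bit-width used).
    """
    for i, bits in enumerate(bit_widths):
        qx = quantize(x, bits)                        # quantize activations
        qw = [quantize(row, bits) for row in weights] # quantize weights
        probs = softmax(linear(qx, qw))
        confidence = max(probs)
        is_last = i == len(bit_widths) - 1
        if confidence >= threshold or is_last:
            return probs.index(confidence), bits

# Usage: an "easy" sample exits at low precision, a "hard" one escalates.
W = [[1.0, 0.0], [0.0, 1.0]]
easy = predict_dynamic([4.0, 0.0], W)
hard = predict_dynamic([1.0, 0.8], W)
print(easy, hard)
```

In this sketch an input with a clear margin between classes is resolved at the lowest bit-width, while an ambiguous input falls through to the highest one, which is the source of the average-case savings the abstract describes.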
【License】
CC BY