ACM Journal on Emerging Technologies in Computing Systems
Sparse BD-Net: A Multiplication-less DNN with Sparse Binarized Depth-wise Separable Convolution
Article
He, Zhezhi [1]; Yang, Li [1]; Angizi, Shaahin [2]; Rakin, Adnan Siraj [1]; Fan, Deliang [1]
[1] Arizona State University, School of Electrical, Computer and Energy Engineering, 650 E Tyler Mall, Tempe, AZ 85287, USA; [2] University of Central Florida, Department of ECE, 4328 Scorpius St, Orlando, FL 32816, USA
Keywords: Deep neural network; model compression; in-memory computing
DOI: 10.1145/3369391
Source: SCIE
Abstract
In this work, we propose a multiplication-less binarized depthwise-separable convolutional neural network, called BD-Net. BD-Net uses a binarized depthwise-separable convolution block as a drop-in replacement for the conventional spatial convolution in deep convolutional neural networks (DNNs). In BD-Net, the computationally expensive convolution operations (i.e., multiplications and accumulations) are converted into energy-efficient addition/subtraction operations. To further compress the model size while keeping addition/subtraction as the dominant computation, we propose a new sparse binarization method with a hardware-oriented structured sparsity pattern. To successfully train such a sparse BD-Net, we propose and leverage two techniques: (1) a modified group-lasso regularization whose group size matches the capacity of the basic computing core in the accelerator, and (2) a weight-penalty-clipping technique that resolves the disharmony between weight binarization and lasso regularization. The experimental results show that the proposed sparse BD-Net achieves comparable or even better inference accuracy than the full-precision CNN baseline. Beyond that, a BD-Net-customized processing-in-memory accelerator is designed using SOT-MRAM, which offers high flexibility in channel expansion and high computational parallelism. Through detailed analysis from both the software and hardware perspectives, we provide intuitive design guidance for software/hardware co-design of DNN acceleration on mobile embedded systems. Note that this journal submission is an extended version of our previously published paper at ISVLSI 2018 [24].
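As a rough illustration of the ideas summarized in the abstract (not the authors' implementation), the following PyTorch-style sketch shows a binarized depthwise-separable block and a structured group-lasso penalty. The group size stands in for the accelerator core capacity, and clipping each group's penalty is a simplified, hypothetical analogue of the paper's weight-penalty-clipping technique; all class and function names here are illustrative.

```python
# Minimal sketch, assuming PyTorch; names and the clipping rule are
# illustrative assumptions, not the paper's exact formulation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class BinarizeSTE(torch.autograd.Function):
    """Sign binarization with a straight-through estimator for gradients."""
    @staticmethod
    def forward(ctx, w):
        ctx.save_for_backward(w)
        return torch.sign(w)

    @staticmethod
    def backward(ctx, grad_out):
        (w,) = ctx.saved_tensors
        # Pass gradients only where |w| <= 1 (standard STE clipping).
        return grad_out * (w.abs() <= 1).float()


class BinarizedDWSeparableBlock(nn.Module):
    """Depthwise conv followed by pointwise conv, both with binarized (+/-1)
    weights, so multiplications reduce to additions/subtractions."""
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.dw = nn.Conv2d(in_ch, in_ch, 3, stride, 1, groups=in_ch, bias=False)
        self.pw = nn.Conv2d(in_ch, out_ch, 1, bias=False)
        self.bn1 = nn.BatchNorm2d(in_ch)
        self.bn2 = nn.BatchNorm2d(out_ch)

    def forward(self, x):
        w_dw = BinarizeSTE.apply(self.dw.weight)
        x = F.conv2d(x, w_dw, stride=self.dw.stride, padding=1,
                     groups=self.dw.groups)
        x = F.relu(self.bn1(x))
        w_pw = BinarizeSTE.apply(self.pw.weight)
        x = F.conv2d(x, w_pw)
        return F.relu(self.bn2(x))


def group_lasso_penalty(weight, group_size, clip_val=1.0):
    """Group-lasso term over contiguous weight groups of size `group_size`
    (matching the computing-core capacity), with each group's norm clipped
    at `clip_val` so the regularizer does not fight +/-1 binarization."""
    flat = weight.reshape(-1)
    pad = (-flat.numel()) % group_size
    if pad:
        flat = F.pad(flat, (0, pad))
    groups = flat.reshape(-1, group_size)
    norms = groups.norm(p=2, dim=1)
    return torch.clamp(norms, max=clip_val).sum()
```

In such a sketch, the training loss would be the task loss plus a small coefficient times the sum of group_lasso_penalty over the convolution weights, pushing whole groups toward zero so the resulting sparsity aligns with the accelerator's basic computing units.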
License: Free