Electronics | Volume: 9
Novel CNN-Based AP2D-Net Accelerator: An Area and Power Efficient Solution for Real-Time Applications on Mobile FPGA
Kuangyuan Sun [1], Yukui Luo [2], Shuai Li [3], Nandakishor Yadav [3], Ken Choi [3]
[1] Department of Computer Science, Rice University, Houston, TX 77005, USA
[2] Department of Electrical and Computer Engineering, University of Illinois at Chicago, Chicago, IL 60607, USA
[3] VLSI Design and Automation Laboratory, Illinois Institute of Technology, Chicago, IL 60616, USA
Keywords: deep neural network accelerator; FPGA; UAV; pipeline architecture; parallel computing; binary neural network
DOI: 10.3390/electronics9050832
Source: DOAJ
【 Abstract 】
Standard convolutional neural networks (CNNs) contain a large amount of data redundancy, and the same accuracy can be obtained with lower-bit weights in place of a floating-point representation. Most CNNs are developed and executed on high-end GPU-based workstations, and porting existing implementations to portable edge FPGAs is difficult because of limited on-chip block memory and battery capacity. In this paper, we present the adaptive pointwise convolution and 2D convolution joint network (AP2D-Net), an ultra-low-power, relatively high-throughput system that combines dynamic-precision weights and activations. The system targets unmanned aerial vehicle (UAV) object detection scenarios, where we trade off accuracy against power efficiency while maintaining high performance. We evaluate the system on the Zynq UltraScale+ MPSoC Ultra96 mobile FPGA platform. The target board reaches a real-time speed of 30 fps at 5.6 W, and the FPGA on-chip power is only 0.6 W. The power efficiency of our system is 2.8× better than the best system design on a Jetson TX2 GPU and 1.9× better than the design on a PYNQ-Z1 SoC FPGA.
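To make the quoted figures concrete, the following minimal sketch converts them into frames per second per watt and energy per frame. It assumes that "power efficiency" means throughput divided by power; the abstract does not state the exact metric, nor whether the 2.8× and 1.9× comparisons use board power or on-chip power. The function and constant names are illustrative and do not come from the paper.

```python
def fps_per_watt(fps: float, watts: float) -> float:
    """Throughput per unit power (equivalently, frames per joule)."""
    if watts <= 0:
        raise ValueError("power must be positive")
    return fps / watts


def joules_per_frame(fps: float, watts: float) -> float:
    """Energy spent on one frame at a steady frame rate."""
    if fps <= 0:
        raise ValueError("frame rate must be positive")
    return watts / fps


# Figures quoted in the abstract.
FPS = 30.0
BOARD_POWER_W = 5.6         # whole target board
FPGA_ON_CHIP_POWER_W = 0.6  # FPGA on-chip power only

# Assumption: efficiency = fps / W; the paper may define it differently.
board_eff = fps_per_watt(FPS, BOARD_POWER_W)
chip_eff = fps_per_watt(FPS, FPGA_ON_CHIP_POWER_W)
board_energy = joules_per_frame(FPS, BOARD_POWER_W)
chip_energy = joules_per_frame(FPS, FPGA_ON_CHIP_POWER_W)

print(f"board:   {board_eff:.2f} fps/W, {board_energy:.3f} J/frame")
print(f"on-chip: {chip_eff:.2f} fps/W, {chip_energy:.3f} J/frame")
```

Under this assumption the board-level figure works out to roughly 5.4 fps/W and the on-chip figure to about 50 fps/W, which shows how strongly the result depends on which power measurement is used.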
【 License 】
Unknown