Journal article

【Abstract】

Standard convolutional neural networks (CNNs) contain a large amount of data redundancy, and the same accuracy can be obtained with lower-bit weights instead of a floating-point representation. Most CNNs have to be developed and executed on high-end GPU-based workstations, and it is hard to port existing implementations to portable edge FPGAs because of limited on-chip block memory and battery capacity. In this paper, we present the adaptive pointwise convolution and 2D convolution joint network (AP2D-Net), an ultra-low-power, relatively high-throughput system that uses dynamic-precision weights and activations. Our system achieves high performance, and we trade off accuracy against power efficiency for unmanned aerial vehicle (UAV) object detection scenarios. We evaluate our system on the Zynq UltraScale+ MPSoC Ultra96 mobile FPGA platform. The target board reaches a real-time speed of 30 fps at 5.6 W, and the FPGA on-chip power is only 0.6 W. The power efficiency of our system is 2.8× better than the best system design on a Jetson TX2 GPU and 1.9× better than the design on a PYNQ-Z1 SoC FPGA.

【License】

Unknown

Electronics	Volume: 9
Novel CNN-Based AP2D-Net Accelerator: An Area and Power Efficient Solution for Real-Time Applications on Mobile FPGA

Kuangyuan Sun¹ Yukui Luo² Shuai Li³ Nandakishor Yadav³ Ken Choi³
[1] Department of Computer Science, Rice University, Houston, TX 77005, USA;
[2] Department of Electrical and Computer Engineering, University of Illinois at Chicago, Chicago, IL 60607, USA;
[3] VLSI Design and Automation Laboratory, Illinois Institute of Technology, Chicago, IL 60616, USA;
Keywords: deep neural network accelerator; FPGA; UAV; pipeline architecture; parallel computing; binary neural network;
DOI : 10.3390/electronics9050832
Source: DOAJ

