Journal Article Details
Xibei Gongye Daxue Xuebao
Design of Deep Learning VLIW Processor for Image Recognition
Keywords: image recognition; deep learning; convolutional neural networks; very long instruction word (VLIW); processor; extensible
DOI: 10.1051/jnwpu/20203810216
Source: DOAJ
[Abstract]

To meet the demands of high-resolution image recognition and efficient local processing in aviation and aerospace applications, and to address the insufficient parallelism of existing research, an extensible multiprocessor-cluster deep learning processor architecture based on VLIW is designed by optimizing the computation of each layer of a deep convolutional neural network model. The design combines parallel processing of feature maps and neurons, instruction-level parallelism based on very long instruction word (VLIW), data-level parallelism across multiprocessor clusters, and pipelining. Test results on an FPGA prototype system show that the processor can effectively run image classification and object detection applications. The peak performance of the processor reaches 128 GOP/s at 200 MHz. On the selected benchmarks, the processor is at least about 12X faster than the CPU and 7X faster than the GPU. Compared with the results of the software framework, the average error in the processor's test accuracy is less than 1%.
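
As a rough sanity check on the peak figure, the sketch below derives how many operations per clock cycle are implied by 128 GOP/s at 200 MHz. Only those two numbers come from the abstract. The function name `ops_per_cycle` and the convention that one multiply-accumulate (MAC) counts as two operations are assumptions made here for illustration; the abstract does not say how operations are counted or how many processing elements the design contains.

```python
# Figures quoted in the abstract.
PEAK_GOPS = 128   # peak performance in GOP/s
CLOCK_MHZ = 200   # operating frequency in MHz

# Assumption (not stated in the abstract): one multiply-accumulate
# is counted as two operations, a common convention for CNN accelerators.
OPS_PER_MAC = 2


def ops_per_cycle(peak_gops: float, clock_mhz: float) -> float:
    """Return the operations per clock cycle implied by a peak rate and a clock."""
    return (peak_gops * 1e9) / (clock_mhz * 1e6)


total_ops = ops_per_cycle(PEAK_GOPS, CLOCK_MHZ)
macs_per_cycle = total_ops / OPS_PER_MAC

print(f"Implied operations per cycle: {total_ops:.0f}")
print(f"Implied MACs per cycle (under the 2-ops-per-MAC assumption): {macs_per_cycle:.0f}")
```

Under these assumptions the peak figure corresponds to 640 operations, or 320 MACs, issued every cycle across the whole processor. How that total is split among clusters, VLIW slots, and pipeline stages is not given in the abstract.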

[License]

Unknown   
