Journal Article Details
Electronics
A Systolic Accelerator for Neuromorphic Visual Recognition
Shuo Tian [1], Lei Wang [1], Weixia Xu [1], Shasha Guo [1], Zhijie Yang [1], Jianfeng Zhang [1], Shi Xu [2]
[1] College of Computer Science and Technology, National University of Defense Technology, Changsha 410000, China
[2] National Innovation Institute of Defense Technology, Beijing 100000, China
Keywords: neuromorphic algorithm; HMAX model; systolic array; hardware accelerator
DOI  :  10.3390/electronics9101690
Source: DOAJ
【 Abstract 】

Advances in neuroscience have encouraged researchers to develop computational models that behave like the human brain. HMAX is one of the biologically inspired models that mimic the functions and structure of the primate visual cortex. HMAX has shown its effectiveness and versatility in multi-class object recognition with a simple computational structure. Implementing the HMAX model in embedded systems remains a challenge, however, because its S2 phase is the most computationally intensive. Previous implementations such as CoRe16 have used a reconfigurable two-dimensional processing element (PE) array to speed up the S2 layer of HMAX. However, the adder tree mechanism that CoRe16 uses to produce output pixels by accumulating partial sums from different PEs increases the runtime of HMAX. To speed up the execution of the S2 layer, in this paper we propose SAFA (systolic accelerator for HMAX), a systolic-array-based architecture that computes and accelerates the S2 stage of HMAX. With the output stationary (OS) dataflow, each PE in SAFA calculates its output pixel independently, without additional accumulation of partial sums across multiple PEs, and needs fewer of the multiplexers used in reconfigurable accelerators. In addition, forwarding the same input or weight data between PEs under OS reduces the memory bandwidth requirements. Simulation results show that, compared with CoRe16, the runtime of the S2 stage, the most computationally intensive stage of the HMAX model, decreases by 5.7%, and the required memory bandwidth is reduced by 3.53× on average across different kernel sizes (except for kernel = 12). SAFA also has lower power and area costs than other reconfigurable accelerators in ASIC synthesis.
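The output stationary idea described in the abstract can be illustrated with a small software model. The sketch below is not the SAFA design: it is a generic, cycle-by-cycle simulation of an output-stationary systolic array computing a plain matrix product, chosen as a stand-in for the S2 computation. The function name `os_systolic_matmul` and the register layout are assumptions made for illustration. Each PE keeps its own accumulator (so no adder tree combines partial sums from several PEs) and forwards its input operands to its right and lower neighbours (so each operand is fetched from memory only once).

```python
def os_systolic_matmul(a, b):
    """Simulate an output-stationary systolic array computing a @ b.

    Hypothetical illustration, not the SAFA architecture: PE (i, j) owns
    the accumulator for output element (i, j). Rows of `a` enter from the
    left edge and columns of `b` from the top edge, each skewed by one
    cycle per row/column, and operands are forwarded PE to PE.
    """
    m, k = len(a), len(a[0])
    n = len(b[0])
    acc = [[0] * n for _ in range(m)]    # stationary outputs, one per PE
    a_reg = [[0] * n for _ in range(m)]  # operand moving left to right
    b_reg = [[0] * n for _ in range(m)]  # operand moving top to bottom
    cycles = m + n + k - 2               # cycles until the last PE finishes

    for t in range(cycles):
        # Visit PEs from the far corner backwards so that each PE reads
        # the value its neighbour held in the previous cycle.
        for i in reversed(range(m)):
            for j in reversed(range(n)):
                if j == 0:
                    s = t - i            # skewed injection at the left edge
                    a_in = a[i][s] if 0 <= s < k else 0
                else:
                    a_in = a_reg[i][j - 1]   # forwarded from the left PE
                if i == 0:
                    s = t - j            # skewed injection at the top edge
                    b_in = b[s][j] if 0 <= s < k else 0
                else:
                    b_in = b_reg[i - 1][j]   # forwarded from the upper PE
                a_reg[i][j] = a_in
                b_reg[i][j] = b_in
                acc[i][j] += a_in * b_in     # local accumulation only
    return acc, cycles


if __name__ == "__main__":
    result, cycles = os_systolic_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
    print(result, cycles)  # [[19, 22], [43, 50]] 4
```

In this model only the PEs on the left and top edges read from memory; every other PE receives its operands from a neighbour, which is the forwarding behaviour the abstract credits for the lower bandwidth requirement.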

【 License 】

Unknown   
