Dissertation Details
Energy efficient processing in memory architecture for deep learning computing acceleration
Processing-in-memory;Deep learning;Resistive ram;Ferroelectric FET;Machine learning computing acceleration
Author: Long, Yun
Advisor: Mukhopadhyay, Saibal (Electrical and Computer Engineering)
Committee: Khan, Asif Islam; Kim, Hyesoon; Krishna, Tushar; Yu, Shimeng
University:Georgia Institute of Technology
Department:Electrical and Computer Engineering
Others: https://smartech.gatech.edu/bitstream/1853/62311/1/LONG-DISSERTATION-2019.pdf
United States | English
Source: SMARTech Repository
PDF
【 Abstract 】

The major objective of this research is to make the processing-in-memory (PIM) based deep learning accelerator more practical and more computing efficient. This research focuses on novel architecture designs based on emerging non-volatile memory (NVM) and leverages software-hardware co-optimization to achieve optimal computing efficiency without compromising accuracy. From the emerging-memory perspective, this research mainly explores resistive RAM (ReRAM) and the ferroelectric FET (FeFET). A dedicated recurrent neural network (RNN) accelerator is proposed which utilizes ReRAM as the basic computation cell for vector-matrix multiplication (VMM). The execution pipeline is specifically optimized to ensure efficiency for RNN computation. To address the challenges stemming from ReRAM, this research also explores FeFET as a replacement for ReRAM as the basic memory cell in the PIM architecture. A dedicated data communication network, named hierarchical network-on-chip (H-NoC), is presented to enhance data transmission efficiency. To eliminate the power- and area-hungry analog-to-digital and digital-to-analog conversion (ADC/DAC) in existing PIM architectures and further enhance efficiency, this research proposes an all-digital, flexible-precision PIM design in which computation is performed with dynamic bit-precision. Beyond circuit and architecture optimization, algorithms are developed to fully exploit the hardware's potential. This research proposes a genetic algorithm (GA) based evolutionary method for layer-wise DNN quantization. DNN models can be dynamically quantized and deployed on the developed hardware platforms, which support flexible bit-precision, to achieve the best computing efficiency without compromising accuracy. To alleviate the accuracy drop caused by device variation (in, e.g., ReRAM and FeFET), this research proposes a hardware-noise-aware training algorithm, leading to a reliable PIM engine built from unreliable devices.
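The hardware-noise-aware training idea mentioned above can be sketched as follows: device variation is modeled as multiplicative Gaussian noise on the stored weights (conductances) during the forward pass, so the trained weights become tolerant to that variation. This is a minimal illustrative sketch on a linear-regression toy problem; the function names, the noise model (`sigma`), and the training setup are assumptions for illustration, not the dissertation's actual implementation.

```python
import numpy as np

def noisy_vmm(x, W, sigma=0.05, rng=None):
    """Vector-matrix multiply with simulated device variation.

    Each weight (conductance) is perturbed by multiplicative Gaussian
    noise, loosely modeling cell-to-cell variation of ReRAM/FeFET devices.
    """
    rng = np.random.default_rng(rng)
    W_noisy = W * (1.0 + sigma * rng.standard_normal(W.shape))
    return x @ W_noisy

def train_step(x, y, W, lr=0.1, sigma=0.05, rng=None):
    """One SGD step on a squared-error loss with noise injected in the
    forward pass, so the learned W tolerates weight perturbations."""
    pred = noisy_vmm(x, W, sigma, rng)
    err = pred - y                    # dL/dpred for 0.5 * ||pred - y||^2
    grad = x.T @ err / x.shape[0]     # gradient w.r.t. the stored weights
    return W - lr * grad
```

Because the injected noise is zero-mean, the expected gradient still points toward the noise-free optimum; the network simply learns a solution whose loss is flat with respect to small weight perturbations, which is the property a PIM engine with unreliable cells needs at inference time.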

【 Preview 】
Attachment list
Files | Size | Format | View
Energy efficient processing in memory architecture for deep learning computing acceleration | 21337 KB | PDF | download
Document metrics
Downloads: 9 | Views: 9