Low-power, high-speed neural networks are critical for providing deployable embedded AI applications at the edge. We describe a Xilinx FPGA implementation of Neural Engineering Framework (NEF) networks with online learning that outperforms mobile Nvidia GPU implementations by an order of magnitude or more. Specifically, we provide an embedded Python-capable PYNQ FPGA implementation supported with a Xilinx Vivado High-Level Synthesis (HLS) workflow that allows sub-millisecond implementation of adaptive neural networks with low-latency, direct I/O access to the physical world. The outcome of this work is NengoFPGA, a seamless and user-friendly extension to the neural compiler Python package Nengo. To reduce memory requirements and improve performance, we tune the precision of the different intermediate variables in the code to achieve competitive absolute accuracy against slower and larger floating-point reference designs. The online learning component of the neural network exploits immediate feedback to adjust the network weights to best support a given arithmetic precision. As the space of possible design configurations of such quantized networks is vast and is subject to a target accuracy constraint, we use the Hyperopt hyper-parameter tuning tool instead of manual search to find Pareto-optimal designs. Specifically, we are able to generate the optimized designs in under 500 short iterations of Vivado HLS C synthesis before running the complete Vivado place-and-route phase on that subset, a much longer process not conducive to rapid exploration. For neural network populations of 64–4096 neurons and 1–8 representational dimensions, our optimized FPGA implementation generated by Hyperopt has a speedup of 10–484× over a competing cuBLAS implementation on the Jetson TX1 GPU while using 2.4–9.5× less power. Our speedups are a result of HLS-specific reformulation (15× improvement), precision adaptation (3× improvement), and low-latency direct I/O access (1000× improvement).
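The trade-off between arithmetic precision and accuracy described above can be illustrated with a small, self-contained sketch. It is not the NengoFPGA tuning flow: it replaces Hyperopt's guided search with an exhaustive sweep over a single parameter, uses the number of fractional bits as a stand-in for hardware cost, and measures error on random values rather than on a trained network. The helper names `quantize` and `pareto_front` are hypothetical and chosen only for this illustration.

```python
import random


def quantize(x, frac_bits):
    """Round x to a fixed-point grid with `frac_bits` fractional bits."""
    scale = 1 << frac_bits
    return round(x * scale) / scale


def pareto_front(points):
    """Return the (cost, error) points not dominated by any other point.

    A point is dominated if another point is no worse in both
    objectives and strictly better in at least one.
    """
    front = []
    for p in points:
        dominated = any(
            q[0] <= p[0] and q[1] <= p[1] and (q[0] < p[0] or q[1] < p[1])
            for q in points
        )
        if not dominated:
            front.append(p)
    return sorted(front)


# Stand-in data: random values in [-1, 1], not real network weights.
random.seed(0)
weights = [random.uniform(-1.0, 1.0) for _ in range(256)]

# Sweep candidate precisions; cost is approximated by the bit count.
designs = []
for frac_bits in range(2, 17, 2):
    err = max(abs(w - quantize(w, frac_bits)) for w in weights)
    designs.append((frac_bits, err))

front = pareto_front(designs)
for bits, err in front:
    print(f"{bits:2d} fractional bits -> max abs error {err:.2e}")
```

In a realistic setting the search space has many precision parameters and each evaluation is costly, which is why a guided tuner is preferable to the exhaustive sweep used here.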
NengoFPGA: an FPGA Backend for the Nengo Neural Simulator