Much recent research focuses on designing accelerators for popular deep learning algorithms, most of which rely heavily on matrix multiplication. A popular approach is therefore to build a neural processing unit (NPU) alongside the CPU to accelerate matrix multiplication. The NPU offloads work from the CPU and often operates in parallel with it, so introducing an NPU generally improves performance. The NPU itself can be accelerated further, because its dominant operation is multiply-add. In this project, we propose two methods to accelerate the NPU: (1) replacing the digital multiply-add unit in the NPU with a time-domain analog/digital mixed-signal multiply-add unit, and (2) replacing the multiply-add operation with a CRC hash table lookup. The results show that the first method is less competitive because of the unit's long delay and high energy consumption. The second method is more promising: it reduces energy consumption by 1.96× with an accuracy drop within 1.2%.