NEUROCOMPUTING | Volume: 461 |
Improving the backpropagation algorithm with consequentialism weight updates over mini-batches | |
Article | |
Paeedeh, Naeem [1]; Ghiasi-Shirazi, Kamaledin [2] | |
[1] Amirkabir Univ Technol, Dept Math & Comp Sci, Tehran, Iran | |
[2] Ferdowsi Univ Mashhad FUM, Dept Comp Engn, Off BC-123, Azadi Sq, Mashhad, Razavi Khorasan, Iran | |
Keywords: Adaptive filters; LMS; NLMS; APA; Backpropagation; SGD; | |
DOI : 10.1016/j.neucom.2021.07.010 | |
Source: Elsevier | |
【 Abstract 】
Normalized least mean squares (NLMS) and the affine projection algorithm (APA) are two successful algorithms that improve the stability of least mean squares (LMS) by reducing the need to change the learning rate during training. In this paper, we extend them to multi-layer neural networks. We first prove that a multi-layer neural network can be considered a stack of adaptive filters, which opens the door to bringing successful algorithms from adaptive filtering to neural networks. We additionally introduce an interpretation for a single fully-connected (FC) layer that is more comprehensible than the complicated geometric interpretation of APA and that can easily be generalized, for instance, to convolutional neural networks and mini-batch training. With this new viewpoint, we introduce a more robust algorithm that predicts and then amends the adverse consequences of some actions that take place in mini-batch backpropagation (BP), even before they happen. The proposed method is a modification of BP that can be used alongside stochastic gradient descent (SGD) and its momentum variants such as Adam and Nesterov. Our experiments show the usefulness of the proposed method in training deep neural networks. It is less sensitive to hyper-parameters and needs less intervention during training. Besides, it usually converges more smoothly and in fewer iterations. Such predictable behavior makes it easier to tune and more resilient during training, and reduces or eliminates its reliance on other techniques such as momentum. (c) 2021 Elsevier B.V. All rights reserved.
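As background for the abstract's claim that NLMS reduces the need to retune the learning rate, the following is a minimal sketch of the classical single-filter LMS and NLMS updates. It is the textbook adaptive-filter case only, not the multi-layer method proposed in the paper. The function names, the example input `x`, the step sizes, and the stabilizer `eps` are illustrative assumptions.

```python
import numpy as np

def lms_step(w, x, d, mu):
    """One LMS update for a linear filter y = w @ x with desired output d."""
    e = d - w @ x                      # a-priori error
    return w + mu * e * x

def nlms_step(w, x, d, mu=1.0, eps=1e-8):
    """One NLMS update: the LMS step divided by the input energy ||x||^2.

    eps is a small stabilizer (an assumption of this sketch) that avoids
    division by zero when x is close to the zero vector.
    """
    e = d - w @ x
    return w + mu * e * x / (eps + x @ x)

def posterior_error(step, w, x, d, **kwargs):
    """Error on the same sample, measured after applying the update."""
    w_new = step(w, x, d, **kwargs)
    return d - w_new @ x

# Hypothetical sample: zero initial weights, so the a-priori error equals d.
w0 = np.zeros(4)
x = np.array([1.0, -2.0, 0.5, 3.0])
d = 1.0

for scale in (1.0, 10.0):
    xs = scale * x
    e_lms = posterior_error(lms_step, w0, xs, d, mu=0.1)
    e_nlms = posterior_error(nlms_step, w0, xs, d, mu=1.0)
    print(f"scale={scale}: LMS posterior error={e_lms:.4f}, "
          f"NLMS posterior error={e_nlms:.2e}")
```

With a fixed `mu`, the LMS posterior error is the a-priori error times `1 - mu * ||x||^2`, so rescaling the input can turn a reasonable step into an overshoot. The NLMS step with `mu=1.0` removes the error on the current sample regardless of input scale. This is one way to read the abstract's remark about reduced sensitivity to the learning rate.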
【 License 】
Free