Современные информационные технологии и IT-образование (Modern Information Technologies and IT-Education)
Stabilizing Elastic Weight Consolidation Method in Practical ML Tasks and Using Weight Importances for Neural Network Pruning
Alexey Kutalev [1], Alisa Lapina [1]
[1] PJSC "Sberbank of Russia", Moscow, Russia
Keywords: neural network; catastrophic forgetting; elastic weight consolidation; back propagation; neural network pruning
DOI: 10.25559/SITITO.17.202102.345-354
Source: DOAJ
【 Abstract 】
This work focuses on the practical application of Elastic Weight Consolidation (EWC) to sequential training of neural networks on several training sets. We compare more rigorously the well-known methods for calculating the weight importances used in EWC: Memory Aware Synapses (MAS), Synaptic Intelligence (SI), and the calculation based on the Fisher information matrix from the original work on EWC. We examine these methods as applied to deep neural networks with fully connected and convolutional layers, find optimal hyperparameters for each of them, and compare the results of sequential learning obtained with each. Next, we point out the problems that arise when applying EWC to deep neural networks with convolutional layers and self-attention layers, such as “exploding gradients” and the loss of significant information in the gradient when its norm is constrained (gradient clipping). We then propose a method for stabilizing EWC that helps to solve these problems, evaluate it against the original method, and show that the stabilized method retains skills in sequential training no worse than the original EWC while avoiding its disadvantages. In conclusion, we note an interesting observation about the use of different types of weight importances in neural network pruning.
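The quantities named in the abstract can be illustrated with a minimal sketch. It assumes the standard EWC formulation, a quadratic penalty (λ/2) Σ F_i (θ_i − θ*_i)² with importances F_i taken from the diagonal of the Fisher information matrix, together with ordinary clipping of the gradient by its norm. The function names and the toy numbers are hypothetical and chosen only for illustration. The sketch shows the problem the abstract describes, not the authors' stabilization method.

```python
import math

def fisher_diagonal(per_sample_grads):
    """Diagonal Fisher estimate: mean of squared per-sample gradients."""
    n = len(per_sample_grads)
    dim = len(per_sample_grads[0])
    return [sum(g[i] ** 2 for g in per_sample_grads) / n for i in range(dim)]

def ewc_penalty(theta, theta_star, importance, lam):
    """Quadratic EWC penalty: (lam / 2) * sum_i F_i * (theta_i - theta*_i)^2."""
    return 0.5 * lam * sum(
        f * (t - s) ** 2 for f, t, s in zip(importance, theta, theta_star)
    )

def ewc_penalty_grad(theta, theta_star, importance, lam):
    """Gradient of the penalty with respect to theta."""
    return [lam * f * (t - s) for f, t, s in zip(importance, theta, theta_star)]

def clip_by_norm(grad, max_norm):
    """Rescale grad so that its Euclidean norm does not exceed max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm <= max_norm:
        return list(grad)
    scale = max_norm / norm
    return [g * scale for g in grad]

# Toy illustration (hypothetical numbers): a very large importance makes the
# penalty gradient dominate the total gradient, so clipping the total also
# shrinks the new-task gradient by the same factor.
theta_star = [0.0, 0.0]        # weights after the first task
theta = [1.0, 2.0]             # current weights while learning the second task
importance = fisher_diagonal([[100.0, 2.0], [300.0, 0.0]])
task_grad = [0.1, -0.2]        # gradient of the new-task loss (assumed)
penalty_grad = ewc_penalty_grad(theta, theta_star, importance, lam=2.0)
total = [a + b for a, b in zip(task_grad, penalty_grad)]
clipped = clip_by_norm(total, max_norm=1.0)
shrink = 1.0 / math.sqrt(sum(g * g for g in total))  # factor applied to every term
print("importance:", importance)
print("fraction of the task gradient that survives clipping:", shrink)
```

With these numbers the clipped update has unit norm, but almost all of it comes from the penalty term, which is one way to read the abstract's "loss of significant information in the gradient" under gradient clipping.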
【 License 】
Unknown