NEUROCOMPUTING | Volume: 400 |
Visualising basins of attraction for the cross-entropy and the squared error neural network loss functions | |
Article | |
Bosman, Anna Sergeevna [1]; Engelbrecht, Andries [2,3]; Helbig, Marde [4] | |
[1] Univ Pretoria, Dept Comp Sci, Pretoria, South Africa | |
[2] Stellenbosch Univ, Dept Ind Engn, Stellenbosch, South Africa | |
[3] Stellenbosch Univ, Comp Sci Div, Stellenbosch, South Africa | |
[4] Griffith Univ, Sch Informat & Commun Technol, Southport, Qld, Australia | |
Keywords: Fitness landscape analysis; Neural networks; Cross-entropy; Squared error; Local minima; Loss functions | |
DOI : 10.1016/j.neucom.2020.02.113 | |
Source: Elsevier | |
【 Abstract 】
Quantification of the stationary points and the associated basins of attraction of neural network loss surfaces is an important step towards a better understanding of neural network loss surfaces at large. This work proposes a novel method to visualise basins of attraction together with the associated stationary points via gradient-based stochastic sampling. The proposed technique is used to perform an empirical study of the loss surfaces generated by two different error metrics: quadratic loss and entropic loss. The empirical observations confirm the theoretical hypothesis regarding the nature of neural network attraction basins. Entropic loss is shown to exhibit stronger gradients and fewer stationary points than quadratic loss, indicating that entropic loss has a more searchable landscape. Quadratic loss is shown to be more resilient to overfitting than entropic loss. Both losses are shown to exhibit local minima, but the number of local minima is shown to decrease with an increase in dimensionality. Thus, the proposed visualisation technique successfully captures the local minima properties exhibited by the neural network loss surfaces, and can be used for the purpose of fitness landscape analysis of neural networks. (C) 2020 Elsevier B.V. All rights reserved.
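The abstract contrasts quadratic (squared error) and entropic (cross-entropy) loss and mentions gradient-based stochastic sampling. The following is a minimal sketch of that idea on a toy problem. It is not the authors' procedure: this record does not describe their sampling algorithm. Everything here is an illustrative assumption, including the single sigmoid unit, the four-point dataset `DATA`, the helper `gradient_walk`, and the step size. For a sigmoid output, the per-sample gradient of the squared error with respect to the pre-activation equals the cross-entropy gradient scaled by 2y(1 - y), which is at most 0.5, so the sketch makes the "stronger gradients" comparison concrete for this toy case only.

```python
import math
import random

# Hypothetical toy data: 1-D inputs with binary targets (not from the paper).
DATA = [(-2.0, 0.0), (-1.0, 0.0), (1.0, 1.0), (2.0, 1.0)]
EPS = 1e-12  # keeps log() away from zero


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def squared_error(w, b):
    # Mean quadratic loss of a single sigmoid unit y = sigmoid(w*x + b).
    return sum((sigmoid(w * x + b) - t) ** 2 for x, t in DATA) / len(DATA)


def cross_entropy(w, b):
    # Mean binary cross-entropy (entropic loss) of the same unit.
    total = 0.0
    for x, t in DATA:
        y = min(max(sigmoid(w * x + b), EPS), 1.0 - EPS)
        total -= t * math.log(y) + (1.0 - t) * math.log(1.0 - y)
    return total / len(DATA)


def grad_squared_error(w, b):
    # d/dz of (y - t)^2 is 2 (y - t) y (1 - y); chain through z = w*x + b.
    gw = gb = 0.0
    for x, t in DATA:
        y = sigmoid(w * x + b)
        dz = 2.0 * (y - t) * y * (1.0 - y)
        gw += dz * x
        gb += dz
    return gw / len(DATA), gb / len(DATA)


def grad_cross_entropy(w, b):
    # d/dz of the cross-entropy with a sigmoid output simplifies to (y - t).
    gw = gb = 0.0
    for x, t in DATA:
        dz = sigmoid(w * x + b) - t
        gw += dz * x
        gb += dz
    return gw / len(DATA), gb / len(DATA)


def gradient_walk(loss, grad, steps=200, lr=0.5, bound=5.0, rng=None):
    # Start at a random point in [-bound, bound]^2 and follow the negative
    # gradient, recording (loss, gradient norm) before each step.
    rng = rng or random.Random(0)
    w, b = rng.uniform(-bound, bound), rng.uniform(-bound, bound)
    trace = []
    for _ in range(steps):
        gw, gb = grad(w, b)
        trace.append((loss(w, b), math.hypot(gw, gb)))
        w -= lr * gw
        b -= lr * gb
    return trace


if __name__ == "__main__":
    for name, loss, grad in [
        ("squared error", squared_error, grad_squared_error),
        ("cross-entropy", cross_entropy, grad_cross_entropy),
    ]:
        # Several walks from different random starting points.
        for seed in range(3):
            trace = gradient_walk(loss, grad, rng=random.Random(seed))
            print(name, seed, "final loss:", trace[-1][0],
                  "final gradient norm:", trace[-1][1])
```

Plotting the recorded loss against the gradient norm for many such walks is one way to see where walks stall (near-zero gradient) and at what loss value. Whether this matches the visualisation proposed in the article cannot be confirmed from this record.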
【 License 】
Free