Computational Intelligence and Neuroscience
AdaCN: An Adaptive Cubic Newton Method for Nonconvex Stochastic Optimization
Zhiwei Zhong1, Maojun Zhang1, Xiangrong Zeng1, Yan Liu1
[1] School of Systems Engineering, National University of Defense Technology, Changsha 410073, China, nudt.edu.cn
DOI: 10.1155/2021/5790608
Source: Hindawi Publishing Corporation
【Abstract】
In this work, we introduce AdaCN, a novel adaptive cubic Newton method for nonconvex stochastic optimization. AdaCN dynamically captures the curvature of the loss landscape through a diagonally approximated Hessian plus the norm of the difference between the previous two estimates. It requires only first-order gradients and updates with linear complexity in both time and memory. To reduce the variance introduced by the stochastic nature of the problem, AdaCN employs the first and second moments to apply an exponential moving average to the iteratively updated stochastic gradients and the approximated stochastic Hessians, respectively. We validate AdaCN in extensive experiments, showing that it outperforms other stochastic first-order methods (including SGD, Adam, and AdaBound) and a stochastic quasi-Newton method (i.e., Apollo) in terms of both convergence speed and generalization performance.
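The abstract's core recipe (a diagonal Hessian approximation built from successive gradient differences, smoothed alongside the gradient by exponential moving averages, then used as a per-coordinate preconditioner) can be sketched as follows. This is a minimal illustrative sketch, not the paper's exact AdaCN update: the secant-style diagonal estimate `|Δg|/|Δx|`, the hyperparameter values, and the plain `m/B` step are assumptions, and the cubic-regularization machinery of the full method is omitted.

```python
import numpy as np

def adacn_like_step(x, grad_fn, state, lr=0.1, beta1=0.9, beta2=0.99, eps=1e-8):
    """One diagonal-Newton-style step with EMA-smoothed gradient and curvature.

    Hypothetical sketch of the idea in the abstract; not the authors' exact update.
    """
    g = grad_fn(x)
    if state.get("x_prev") is not None:
        # Secant-style diagonal Hessian estimate from the last two iterates:
        # curvature ~ change in gradient per change in parameter, elementwise.
        dx = x - state["x_prev"]
        dg = g - state["g_prev"]
        h = np.abs(dg) / (np.abs(dx) + eps)
    else:
        h = np.ones_like(x)  # no curvature information yet
    # First moment: EMA of stochastic gradients (variance reduction).
    state["m"] = beta1 * state.get("m", np.zeros_like(x)) + (1 - beta1) * g
    # Second moment: EMA of the approximated diagonal Hessian.
    state["B"] = beta2 * state.get("B", np.ones_like(x)) + (1 - beta2) * h
    state["x_prev"], state["g_prev"] = x.copy(), g
    # Preconditioned update: smoothed gradient scaled by smoothed curvature.
    return x - lr * state["m"] / (state["B"] + eps)

# Usage on a toy quadratic f(x) = 0.5 * ||x||^2, whose gradient is x itself.
x = np.array([5.0, -3.0])
state = {}
for _ in range(200):
    x = adacn_like_step(x, lambda v: v, state)
```

Both the gradient EMA and the Hessian EMA are elementwise, so the step costs O(d) time and memory in the parameter dimension, matching the linear-complexity claim in the abstract.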
【License】
CC BY
【Preview】
| Files | Size | Format | View |
|---|---|---|---|
| RO202112149411678ZK.pdf | 7300KB | PDF | |