Thesis details
Trainability and generalization of small-scale neural networks
Song, Myung Hwan; Sun, Ruoyu
Keywords: Deep Learning; Neural Networks; Learning Theory
Others: https://www.ideals.illinois.edu/bitstream/handle/2142/104821/SONG-THESIS-2019.pdf?sequence=1&isAllowed=y
United States | English
Source: The Illinois Digital Environment for Access to Learning and Scholarship
【 Abstract 】

As deep learning has become the solution of choice for many machine learning and artificial intelligence applications, network architectures have developed accordingly. Modern deep learning applications often use overparameterized settings, which is the opposite of what conventional learning theory suggests. While deep neural networks are considered less vulnerable to overfitting even with overparameterized architectures, this project observed that properly trained small-scale networks can indeed outperform their larger counterparts. The generalization ability of small-scale networks has been overlooked in much research and practice because of their extremely slow convergence speed. This project observed that imbalanced layer-wise gradient norms can hinder the overall convergence speed of neural networks, and that narrow networks are particularly vulnerable to this. This project investigates possible causes of the convergence failure of small-scale neural networks and suggests a strategy to alleviate the problem.
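The imbalance the abstract refers to can be measured by comparing the gradient norms of each layer during training. Below is a minimal illustrative sketch (not the thesis's actual experiment or architecture): a tiny two-layer network with a deliberately narrow hidden layer, whose per-layer gradient norms are computed by hand with numpy so the ratio between layers can be inspected. All dimensions, initializations, and the synthetic data are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical tiny network: 10 inputs -> narrow hidden layer of 2 units -> 1 output.
# This is only a sketch to show how layer-wise gradient norms can be compared;
# the thesis's experiments and architectures are not reproduced here.
n_in, n_hidden, n_samples = 10, 2, 32
X = rng.normal(size=(n_samples, n_in))
y = rng.normal(size=(n_samples, 1))

W1 = rng.normal(scale=0.1, size=(n_in, n_hidden))
W2 = rng.normal(scale=0.1, size=(n_hidden, 1))

# Forward pass with ReLU activation and 0.5 * MSE loss.
h_pre = X @ W1
h = np.maximum(h_pre, 0.0)
pred = h @ W2
err = pred - y                      # d(loss)/d(pred) for 0.5 * MSE

# Backward pass: gradients of the loss w.r.t. each weight matrix.
grad_W2 = h.T @ err / n_samples
grad_h = err @ W2.T
grad_W1 = X.T @ (grad_h * (h_pre > 0)) / n_samples

# Layer-wise gradient norms; a large ratio between layers signals imbalance.
n1 = np.linalg.norm(grad_W1)
n2 = np.linalg.norm(grad_W2)
print(f"||grad W1|| = {n1:.4f}, ||grad W2|| = {n2:.4f}")
```

Tracking such ratios over training steps is one way to diagnose whether one layer's updates dominate the others, which the abstract identifies as a factor slowing the convergence of narrow networks.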

【 Preview 】
Attachments
Files | Size | Format
Trainability and generalization of small-scale neural networks | 931KB | PDF