Thesis Details
Deep Neural Networks for Visual Reasoning, Program Induction, and Text-to-Image Synthesis.
neural networks;Computer Science;Engineering;Computer Science & Engineering
Reed, Scott; Provost, Emily Kaplan Mower
University of Michigan
Keywords: neural networks; Computer Science; Engineering; Computer Science & Engineering
Others  :  https://deepblue.lib.umich.edu/bitstream/handle/2027.42/135763/reedscot_1.pdf?sequence=1&isAllowed=y
Switzerland | English
Source: The Illinois Digital Environment for Access to Learning and Scholarship
【 Abstract 】

Deep neural networks excel at pattern recognition, especially in the setting of large-scale supervised learning. A combination of better hardware, more data, and algorithmic improvements has yielded breakthroughs in image classification, speech recognition, and other perception problems. The research frontier has shifted towards the weak side of neural networks: reasoning, planning, and (like all machine learning algorithms) creativity. How can we advance along this frontier using the same generic techniques so effective in pattern recognition, i.e., gradient descent with backpropagation? In this thesis I develop neural architectures with new capabilities in visual reasoning, program induction, and text-to-image synthesis. I propose two models that disentangle the latent visual factors of variation that give rise to images, and enable analogical reasoning in the latent space. I show how to augment a recurrent network with a memory of programs that enables the learning of compositional structure for more data-efficient and generalizable program induction. Finally, I develop a generative neural network that translates descriptions of birds, flowers, and other categories into compelling natural images.

【 Preview 】
Attachments
Files Size Format View
Deep Neural Networks for Visual Reasoning, Program Induction, and Text-to-Image Synthesis. 14034KB PDF download
Document metrics
Downloads: 11  Views: 13