Thesis details
Visual questioning agents
Jain, Unnat ; Lazebnik, Svetlana ; Schwing, Alexander Gerhard
Keywords: Visual Question Generation, Visual Dialog, Variational Autoencoders, Language and Vision, Computer Vision
Others  :  https://www.ideals.illinois.edu/bitstream/handle/2142/101574/JAIN-THESIS-2018.pdf?sequence=1&isAllowed=y
United States | English
Source: The Illinois Digital Environment for Access to Learning and Scholarship
【 Abstract 】

Curious questioning, or the ability to inquire about the surrounding environment or additional context, is an important step towards building agents that go beyond learning from a static knowledge base. The ability to request feedback is the first step in building intelligent agents that can incorporate this feedback to enhance learning. Visual questioning tasks help model this human skill of “curiosity.” In this thesis, we focus on two relevant vision-based questioning tasks: visual question generation and visual dialog. We propose novel approaches and evaluation metrics for these tasks. For visual question generation, we combine language models with variational autoencoders to enhance diversity in generated text. We also suggest diversity metrics to quantify these improvements. For visual dialog, we introduce a reformulated dataset to enable training of questioning agents in a dialog setup. We also introduce simpler and more effective baselines for the task. Our combined results in visual question generation and visual dialog contribute to establishing visual questioning as an important next step for computer vision and, more generally, for artificial intelligence.

【 Preview 】
Attachments
Files Size Format View
Visual questioning agents 4825KB PDF download
  Document metrics
  Downloads: 41  Views: 58