Dissertation details
Grounding natural language phrases in images and video
Plummer, Bryan A.
Keywords: Computer Vision, Natural Language Processing, Phrase Grounding
Others  :  https://www.ideals.illinois.edu/bitstream/handle/2142/100977/PLUMMER-DISSERTATION-2018.pdf?sequence=1&isAllowed=y
United States | English
Source: The Illinois Digital Environment for Access to Learning and Scholarship
[ Abstract ]

Grounding language in images has been shown to improve performance on many image-language tasks. To spur research on this topic, this dissertation introduces a new dataset that provides ground-truth annotations of the locations of noun phrase chunks in image captions. I begin by introducing a constituent task termed phrase localization, where the goal is to localize an entity known to exist in an image when provided with a natural language query. To address this task, I introduce a model that learns a set of models, each of which captures a different concept that is useful for the task. These concepts can be predefined, such as attributes gleaned from adjectives, or learned automatically in a single end-to-end neural network. I also address the more challenging detection-style task, where the goal is to localize a phrase and determine whether it is associated with an image. Multiple applications of the models presented in this work demonstrate their value beyond the phrase localization task.

[ Preview ]
Attachments
Files Size Format View
Grounding natural language phrases in images and video 8998KB PDF download