Dissertation Details
Scene understanding with complete scenes and structured representations
Guo, Ruiqi
Keywords: Scene Understanding; Computer Vision; Machine Learning; Computer Graphics; Image Parsing; Image Segmentation; RGB-D images
Others  :  https://www.ideals.illinois.edu/bitstream/handle/2142/50564/Ruiqi_Guo.pdf?sequence=1&isAllowed=y
United States | English
Source: The Illinois Digital Environment for Access to Learning and Scholarship
【 Abstract 】

Humans can understand scenes in abundant detail: they see layouts, surfaces, and the shapes of objects, among other details. By contrast, many machine-based scene analysis algorithms parse scenes with simple representations, mainly bounding boxes and pixel labels, and apply only to visible regions. We believe we should move to deeper levels of scene analysis, embracing a more comprehensive, structured representation.

In this dissertation, we focus on analyzing scenes to their complete extent and in structured detail. First, our work uses a structured representation that is closer to human interpretation, with a mixture of layout, functional objects, and clutter. We developed annotation tools and collected a dataset of 1449 rooms annotated with detailed 3D models.

Another feature of our work is that we understand scenes to their complete extent, even the parts beyond the line of sight. We present a simple framework that detects the visible portion with appearance-based models and then infers the occluded portion with a contextual approach. We integrate context from surrounding regions, the spatial prior, and the shape regularity of background surfaces. Our method is applicable to 2D images and can also be used to infer support surfaces in 3D scenarios. Our complete surface prediction quantitatively outperforms relevant baselines, especially when surfaces are occluded.

Finally, we present a system that interprets single-view RGB-D images of indoor scenes into our proposed representation. Such a scene interpretation is useful for robotics and visual reasoning but difficult to produce due to the well-known challenge of segmenting objects, the high degree of occlusion, and the diversity of objects in indoor scenes. We take a data-driven approach, generating sets of potential object regions, matching them with regions in training images, and transferring and aligning associated 3D models while encouraging them to be consistent with observed depths. To the best of our knowledge, this is the first automatic system capable of interpreting scenes into 3D models with similar levels of detail.

【 Preview 】
Attachment list
Files | Size | Format | View
Scene understanding with complete scenes and structured representations | 27430KB | PDF | download
Document metrics
Downloads: 89    Views: 47