Thesis

【Abstract】

Estimating the 3D structure of a scene and recognizing scene elements are two core functions that support many artificial intelligence applications. The ability to achieve both goals using only RGB images is very valuable for low-cost systems, but it is also extremely challenging. A scene may comprise a large number of different points, regions, and objects. Identifying their existence and distinguishing their semantic properties from RGB images relates to two research topics in computer vision: geometric scene understanding and semantic scene understanding. Over the past decades, many researchers have devoted themselves to geometric scene understanding, with work on camera calibration, structure-from-motion, and dense reconstruction. Meanwhile, numerous other researchers have studied semantic scene understanding, with work on object recognition, region segmentation, and layout estimation. However, solving the geometric and semantic understanding problems separately usually leads to limited estimation capability and recognition accuracy.

In this thesis, I propose a novel image-based framework to jointly solve the geometric and semantic scene understanding problems, which covers the complete process of recognizing elements in a scene, estimating their spatial properties, and identifying their mutual relationships. Recognizing components in a scene provides constraints for estimating the geometric structure of the scene, while the estimated geometric structure in turn greatly helps the recognition task by providing contextual information and pruning impossible configurations of scene components. Experiments show that jointly solving the geometric and semantic understanding problems yields significantly higher accuracy than solving them separately.

【Preview】

Attachments
Files	Size	Format	View
Geometric and Semantic Scene Understanding.	35073KB	PDF	download


Geometric and Semantic Scene Understanding.
Bao, Yingze; Hoiem, Derek W.
University of Michigan
Keywords: Computer Vision; Object Detection; 3D Reconstruction; Structure from Motion; Layout Estimation; Computer Science; Engineering; Electrical Engineering: Systems
Others : https://deepblue.lib.umich.edu/bitstream/handle/2027.42/107131/yingze_1.pdf?sequence=1&isAllowed=y
Switzerland | English
Source: The Illinois Digital Environment for Access to Learning and Scholarship


	Document metrics
	Downloads: 43	Views: 35
