Journal of Vision
Bayesian depth estimation from monocular natural images
Che-Chun Su [1], Alan C. Bovik [1], Lawrence K. Cormack [2]
[1] Department of Electrical and Computer Engineering, The University of Texas at Austin, Austin, TX, USA; [2] Department of Psychology, The University of Texas at Austin, Austin, TX, USA
Keywords: maps; luminance; natural scenes; datasets; vision; regression analysis; histogram
DOI: 10.1167/17.5.22
Subject classification: Ophthalmology
Source: Association for Research in Vision and Ophthalmology
【 Abstract 】
Estimating an accurate and naturalistic dense depth map from a single monocular photographic image is a difficult problem. Nevertheless, human observers have little difficulty understanding the depth structure implied by photographs. Two-dimensional (2D) images of the real-world environment contain significant statistical information regarding the three-dimensional (3D) structure of the world that the vision system likely exploits to compute perceived depth, monocularly as well as binocularly. Toward understanding how this might be accomplished, we propose a Bayesian model of monocular depth computation that recovers detailed 3D scene structures by extracting reliable, robust, depth-sensitive statistical features from single natural images. These features are derived using well-accepted univariate natural scene statistics (NSS) models and recent bivariate/correlation NSS models that describe the relationships between 2D photographic images and their associated depth maps. This is accomplished by building a dictionary of canonical local depth patterns from which NSS features are extracted as prior information. The dictionary is used to create a multivariate Gaussian mixture (MGM) likelihood model that associates local image features with depth patterns. A simple Bayesian predictor is then used to form spatial depth estimates. The depth results produced by the model, despite its simplicity, correlate well with ground-truth depths measured by a current-generation terrestrial light detection and ranging (LIDAR) scanner. Such a strong form of statistical depth information could be used by the visual system when creating overall estimated depth maps incorporating stereopsis, accommodation, and other conditions. Indeed, even in isolation, the Bayesian predictor delivers depth estimates that are competitive with state-of-the-art "computer vision" methods that utilize highly engineered image features and sophisticated machine learning algorithms.
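The Bayesian prediction step summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each canonical depth pattern in the dictionary has a prior probability and, as a simplification of the MGM likelihood, a single multivariate Gaussian over image features. The names `log_gaussian` and `map_depth_pattern` and the toy numbers are hypothetical.

```python
import numpy as np

def log_gaussian(x, mean, cov):
    """Log density of a multivariate Gaussian evaluated at x."""
    d = x.shape[0]
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = diff @ np.linalg.solve(cov, diff)  # squared Mahalanobis distance
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + maha)

def map_depth_pattern(feature, priors, means, covs):
    """Return the index of the most probable depth pattern and the posterior.

    Assumption: one Gaussian per dictionary pattern (a simplification of a
    mixture likelihood), combined with the pattern prior via Bayes' rule.
    """
    log_post = np.array([
        np.log(p) + log_gaussian(feature, m, c)
        for p, m, c in zip(priors, means, covs)
    ])
    log_post -= log_post.max()          # stabilize before exponentiating
    post = np.exp(log_post)
    post /= post.sum()                  # normalize to a probability vector
    return int(np.argmax(post)), post

# Toy dictionary with two hypothetical depth patterns and 2D features.
priors = np.array([0.5, 0.5])
means = [np.array([0.0, 0.0]), np.array([3.0, 3.0])]
covs = [np.eye(2), np.eye(2)]

k, post = map_depth_pattern(np.array([2.9, 3.1]), priors, means, covs)
```

In this toy case the feature vector lies close to the second pattern's mean, so the second pattern receives nearly all of the posterior mass. In a full pipeline, the selected pattern (or a posterior-weighted combination of patterns) would supply the local depth estimate for the image patch.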
【 License 】
CC BY