Journal article

【Abstract】

Depth estimation from a single image is a challenging task, yet it holds great promise for autonomous driving and augmented reality. However, prediction accuracy degrades significantly when a trained network is transferred from its training dataset to real-world scenarios. To address this issue, we propose MonoMeMa, a novel deep architecture based on human monocular cues: with a single eye, humans can perceive depth from the relative size of objects, light and shadow, and similar cues, drawing on previous visual experience. Our method simulates the formation and use of human monocular visual memory in three steps. First, MonoMeMa perceives real-world objects and extracts their feature vectors (encoding). Then, it maintains and replaces the extracted feature vectors over time (storing). Finally, MonoMeMa combines the feature vectors of query objects with the memory to infer depth information (retrieving). According to the simulation results, our model achieves state-of-the-art results on the KITTI driving dataset. Moreover, MonoMeMa exhibits remarkable generalization performance when it is transferred to other driving datasets without any fine-tuning.

【License】

Unknown

IEEE Access
Learning Depth Estimation From Memory Infusing Monocular Cues: A Generalization Prediction Approach

Jinkuan Zhu¹ Tingyong Wu¹ Yakun Zhou¹ Jienan Chen¹ Musen Hu¹ Jinting Luo¹ Xingzhong Xiong²
[1] National Key Laboratory of Science and Technology on Communications, University of Electronic Science and Technology of China, Chengdu, China; [2] School of Automation and Information Engineering, Sichuan University of Science and Engineering, Yibin, China
Keywords: Long short-term memory (LSTM); monocular depth estimation; multi-layer perceptron (MLP); region proposal network (RPN)
DOI : 10.1109/ACCESS.2022.3151108
Source: DOAJ


