Journal Article Details
IEEE Access
Learning Depth Estimation From Memory Infusing Monocular Cues: A Generalization Prediction Approach
Jinkuan Zhu [1], Tingyong Wu [1], Yakun Zhou [1], Jienan Chen [1], Musen Hu [1], Jinting Luo [1], Xingzhong Xiong [2]
[1] National Key Laboratory of Science and Technology on Communications, University of Electronic Science and Technology of China, Chengdu, China; [2] School of Automation and Information Engineering, Sichuan University of Science and Engineering, Yibin, China
Keywords: Long short-term memory (LSTM); monocular depth estimation; multi-layer perceptron (MLP); region proposal network (RPN)
DOI: 10.1109/ACCESS.2022.3151108
Source: DOAJ
【 Abstract 】

Depth estimation from a single image is a challenging task with promising applications in autonomous driving and augmented reality. However, prediction accuracy degrades significantly when a trained network is transferred from its training dataset to real-world scenarios. To address this issue, we propose MonoMeMa, a novel deep architecture based on human monocular cues: drawing on previous visual experience, humans can perceive depth with one eye from the relative size of objects, light and shadow, and similar cues. Our method simulates the formation and use of human monocular visual memory in three steps. First, MonoMeMa perceives real-world objects and extracts their feature vectors (encoding). Then, it maintains and replaces the extracted feature vectors over time (storing). Finally, MonoMeMa combines the feature vectors of query objects with the memory to infer depth information (retrieving). According to the simulation results, our model achieves state-of-the-art results on the KITTI driving dataset. Moreover, MonoMeMa exhibits remarkable generalization performance when migrated to other driving datasets without any fine-tuning.
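
The storing and retrieving steps described in the abstract can be illustrated with a toy example. The sketch below is not the MonoMeMa architecture (which, per the keywords, involves an LSTM, an MLP, and a region proposal network). It only shows one simple way a fixed-capacity memory could replace old entries over time and answer a query by similarity. The class `ObjectMemory`, the function `cosine`, the FIFO replacement policy, and the similarity-weighted average are all assumptions made for illustration, and the encoding step is skipped by passing feature vectors in directly.

```python
import math
from collections import deque


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


class ObjectMemory:
    """Hypothetical fixed-capacity memory of (feature vector, depth) pairs."""

    def __init__(self, capacity):
        # deque(maxlen=...) discards the oldest entry once capacity is reached,
        # which stands in for "maintains and replaces ... over time" (storing).
        self.slots = deque(maxlen=capacity)

    def store(self, feature, depth):
        self.slots.append((list(feature), float(depth)))

    def retrieve(self, query):
        """Similarity-weighted average of stored depths (retrieving).

        Returns None when the memory is empty or no entry is positively similar.
        """
        weights = [max(cosine(query, f), 0.0) for f, _ in self.slots]
        total = sum(weights)
        if total == 0.0:
            return None
        return sum(w * d for w, (_, d) in zip(weights, self.slots)) / total


memory = ObjectMemory(capacity=2)
memory.store([1.0, 0.0], 10.0)   # e.g. features of a nearby object
memory.store([0.0, 1.0], 30.0)   # e.g. features of a distant object
estimate = memory.retrieve([1.0, 0.0])
```

With a capacity of 2, storing a third entry evicts the oldest one, so later queries are answered only from the more recent entries. A learned model would replace both the hand-written similarity and the FIFO policy with trained components.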

【 License 】

Unknown   
