Journal Article Details
Computational Visual Media
Automatic object annotation in streamed and remotely explored large 3D reconstructions
Hannes Kaufmann [1], Benjamin Höller [1], Annette Mossel [1]
[1] Institute of Visual Computing and Human-Centered Technology, Vienna University of Technology, Favoritenstraße 9-11/193/06, A-1040, Vienna, Austria
Keywords: dense 3D reconstruction; object detection; CNN; distributed virtual reality
DOI  :  10.1007/s41095-020-0194-4
Source: Springer
【 Abstract 】

We introduce a novel framework for 3D scene reconstruction with simultaneous object annotation that combines a pre-trained 2D convolutional neural network (CNN), incremental data streaming, and remote exploration in a virtual reality setup. It allows any 2D box detection or segmentation network to be integrated. We contribute new approaches to (i) asynchronously perform dense 3D reconstruction and object annotation at interactive frame rates, (ii) efficiently optimize CNN results in terms of object prediction and spatial accuracy, and (iii) generate computationally efficient colliders in large triangulated 3D reconstructions at run-time for 3D scene interaction. Our method is novel in combining CNNs that have long and varying inference times with live 3D reconstruction from RGB-D camera input. We further propose a lightweight data structure that stores the 3D reconstruction data and object annotations and enables fast incremental data transmission for real-time exploration with a remote client, which has not been presented before. Our framework achieves update rates of 22 fps (SSD MobileNet) and 19 fps (Mask R-CNN) for indoor environments of up to 800 m³. We evaluated the accuracy of 3D object detection. Our work provides a versatile foundation for semantic scene understanding of large streamed 3D reconstructions, while being independent of the CNN's processing time. Source code is available for non-commercial use.
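
The abstract's point (i), running reconstruction and annotation asynchronously so that the update rate does not depend on CNN inference time, can be illustrated with a small producer/consumer sketch. This is not the authors' implementation: `fake_detector`, `AsyncAnnotator`, and `reconstruct` are hypothetical names, the detector is a sleep-based stand-in for a 2D CNN, and the "integration" step is only a counter. It assumes a simple policy in which at most one frame waits for the detector and newer frames are skipped while that slot is occupied.

```python
import queue
import threading
import time


def fake_detector(frame_id):
    """Hypothetical stand-in for a slow 2D CNN; returns one labelled box."""
    time.sleep(0.05)  # simulate a long inference time
    return {"frame": frame_id, "label": "chair", "box": (10, 20, 50, 80)}


class AsyncAnnotator:
    """Runs a detector on a worker thread so the caller never blocks on it."""

    def __init__(self, detector):
        self.detector = detector
        self.pending = queue.Queue(maxsize=1)  # at most one frame waits
        self.results = queue.Queue()
        self.worker = threading.Thread(target=self._run, daemon=True)
        self.worker.start()

    def submit(self, frame_id):
        """Offer a frame; skip it if another frame is already waiting."""
        try:
            self.pending.put_nowait(frame_id)
            return True
        except queue.Full:
            return False

    def _run(self):
        while True:
            frame_id = self.pending.get()
            if frame_id is None:  # sentinel: shut down
                break
            self.results.put(self.detector(frame_id))

    def poll(self):
        """Return all finished detections without blocking."""
        finished = []
        while True:
            try:
                finished.append(self.results.get_nowait())
            except queue.Empty:
                return finished

    def close(self):
        """Let the worker finish queued work, then stop it."""
        self.pending.put(None)
        self.worker.join()


def reconstruct(num_frames, annotator):
    """Integrate every frame; attach annotations whenever they arrive."""
    integrated, annotations = 0, []
    for frame_id in range(num_frames):
        integrated += 1  # placeholder for fusing an RGB-D frame into the model
        annotator.submit(frame_id)
        annotations.extend(annotator.poll())
        time.sleep(0.005)  # simulated per-frame reconstruction cost
    annotator.close()
    annotations.extend(annotator.poll())
    return integrated, annotations
```

Calling `reconstruct(20, AsyncAnnotator(fake_detector))` integrates all 20 frames while only a subset receives detections, which is the sense in which the reconstruction loop stays independent of the detector's speed in this sketch.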

【 License 】

CC BY   
