Journal Article Details
Jisuanji Kexue (Computer Science)
Coherent Semantic Spatial-Temporal Attention Network for Video Inpainting
LIU Lang, LI Liang, DAN Yuan-hong [1]
[1] College of Computer Science and Engineering, Chongqing University of Technology, Chongqing 400054, China
Keywords: video inpainting | image inpainting | spatial-temporal attention | feature loss | VGG loss
DOI: 10.11896/jsjkx.200600130
Source: DOAJ
【Abstract】

Existing video inpainting methods often produce blurred textures, distorted structures, and artifacts, while directly applying an image-based inpainting model to video leads to temporal inconsistency. From a temporal perspective, a novel coherent semantic spatial-temporal attention (CSSTA) mechanism for video inpainting is proposed. Through the attention layer, the model focuses on regions that are occluded in the target frame but visible in adjacent frames, so as to collect visible content to fill the hole region of the target frame. The CSSTA layer not only models the semantic correlation between hole features but also captures long-range correlations between distant visible content and the hole regions. To complete semantically coherent hole regions, a novel loss function, Feature Loss, is proposed to replace VGG Loss. The model is built on a two-stage coarse-to-fine encoder-decoder architecture that collects and refines information from adjacent frames. Experimental results on the YouTube-VOS and DAVIS datasets show that the proposed method runs at nearly real-time speed and outperforms three typical video inpainting methods in inpainting quality, peak signal-to-noise ratio (PSNR), and structural similarity (SSIM).
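The abstract does not give the layer's equations; purely as an illustration, the following PyTorch-style sketch shows one plausible reading of a CSSTA-style attention fill: hole positions in the target frame attend, via cosine similarity, over visible positions in the target and adjacent frames, and the hole is filled with the attention-weighted sum. All names here (`CSSTASketch`, the tensor layout, the temperature) are hypothetical, and the learned query/key projections a real layer would have are omitted for brevity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CSSTASketch(nn.Module):
    """Illustrative sketch of a spatial-temporal attention fill: for each
    position in the target frame, attend over visible positions of the
    target and adjacent frames, then fill only the hole region with the
    attention-weighted sum of visible features."""

    def __init__(self, temperature=10.0):
        super().__init__()
        self.temperature = temperature  # sharpens the softmax over similarities

    def forward(self, feats, masks):
        # feats: (B, T, C, H, W) features of target + adjacent frames;
        #        index 0 is assumed to be the target frame in this sketch
        # masks: (B, T, 1, H, W), 1 = hole, 0 = visible
        B, T, C, H, W = feats.shape
        target = feats[:, 0]                          # (B, C, H, W)
        q = target.flatten(2).transpose(1, 2)         # (B, HW, C) queries
        k = feats.flatten(3).permute(0, 1, 3, 2).reshape(B, T * H * W, C)
        visible = (1.0 - masks).flatten(3).reshape(B, T * H * W)

        # cosine similarity between target positions and all positions
        sim = torch.bmm(F.normalize(q, dim=-1),
                        F.normalize(k, dim=-1).transpose(1, 2))  # (B, HW, THW)

        # suppress hole positions among the keys so only visible content is copied
        sim = sim.masked_fill(visible.unsqueeze(1) < 0.5, -1e4)
        attn = torch.softmax(sim * self.temperature, dim=-1)
        out = torch.bmm(attn, k).transpose(1, 2).reshape(B, C, H, W)

        # keep visible target features; fill only the hole region
        hole = masks[:, 0]
        return target * (1.0 - hole) + out * hole

# usage on dummy tensors: 5 frames, target first
layer = CSSTASketch()
filled = layer(torch.randn(2, 5, 64, 32, 32), torch.zeros(2, 5, 1, 32, 32))
```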

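Similarly, the abstract names Feature Loss as a replacement for VGG Loss without defining it. A common construction, sketched below purely under that assumption, is a perceptual-style L1 distance computed on intermediate features of the model's own encoder instead of a fixed pretrained VGG; `feature_loss`, the `encoder` stage list, and the chosen `layers` are all hypothetical.

```python
import torch
import torch.nn.functional as F

def feature_loss(encoder, pred, target, layers=(1, 2, 3)):
    """Hypothetical feature loss: L1 distance between intermediate feature
    maps of the completed frame and the ground-truth frame, extracted by
    `encoder` (assumed here to be an iterable of stages from the inpainting
    model's own encoder, standing in for a pretrained VGG)."""
    loss = 0.0
    feats_pred, feats_gt = pred, target
    for i, stage in enumerate(encoder):
        feats_pred = stage(feats_pred)
        with torch.no_grad():           # no gradient through the reference path
            feats_gt = stage(feats_gt)
        if i in layers:
            loss = loss + F.l1_loss(feats_pred, feats_gt)
    return loss
```

Whether gradients should also update the encoder through `feats_pred` is a design choice this sketch leaves open; a fixed-feature variant would wrap that path in `torch.no_grad()` as well and detach the result.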
【License】

Unknown
