Journal Article Details
CAAI Transactions on Intelligence Technology
Learning DALTS for cross-modal retrieval
article
Zheng Yu [1], Wenmin Wang [1]
[1] School of Electronic and Computer Engineering, Shenzhen Graduate School, Peking University
Keywords: information retrieval; text analysis; natural language processing; image segmentation; image retrieval; recurrent neural nets; cross-modal retrieval; DALTS; domain-adaptive limited text space; image space; Flickr8K; Flickr30K; MSCOCO; text space features; B6135 Optical, image and video signal processing; C5260B Computer vision and image processing techniques; C5290 Neural computing techniques; C6130D Document processing techniques; C6180N Natural language processing; C7250R Information retrieval techniques
DOI: 10.1049/trit.2018.1051
Subject classification: Mathematics (General)
Source: Wiley
【 Abstract 】

Cross-modal retrieval has been recently proposed to find an appropriate subspace, where the similarity across different modalities such as image and text can be directly measured. In this study, different from most existing works, the authors propose a novel model for cross-modal retrieval based on a domain-adaptive limited text space (DALTS) rather than a common space or an image space. Experimental results on three widely used datasets, Flickr8K, Flickr30K and Microsoft Common Objects in Context (MSCOCO), show that the proposed method, dubbed DALTS, is able to learn superior text space features which can effectively capture the necessary information for cross-modal retrieval. Meanwhile, DALTS achieves promising improvements in accuracy for cross-modal retrieval compared with the current state-of-the-art methods.
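To make the retrieval setting concrete, the sketch below shows generic cross-modal retrieval by cosine similarity in a learned embedding space, as the abstract describes. It is an illustrative assumption only: the encode_image and encode_text functions are random stand-ins and the 256-dimensional space is arbitrary; this is not the DALTS model itself.

```python
# Minimal sketch: rank a gallery of image embeddings against a text query by
# cosine similarity in a shared (text-like) space. Encoders are hypothetical
# placeholders, not the authors' DALTS architecture.
import numpy as np

rng = np.random.default_rng(0)
DIM = 256  # assumed embedding dimensionality


def encode_image(image_id: int) -> np.ndarray:
    """Hypothetical image encoder: maps an image to a DIM-d space vector."""
    return rng.standard_normal(DIM)


def encode_text(caption: str) -> np.ndarray:
    """Hypothetical text encoder: maps a caption to a DIM-d space vector."""
    return rng.standard_normal(DIM)


def l2_normalize(x: np.ndarray) -> np.ndarray:
    """Normalize vectors so the dot product equals cosine similarity."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + 1e-12)


def retrieve(query_vec: np.ndarray, gallery: np.ndarray, top_k: int = 5) -> np.ndarray:
    """Return indices of the top-k gallery items by cosine similarity to the query."""
    sims = l2_normalize(gallery) @ l2_normalize(query_vec)
    return np.argsort(-sims)[:top_k]


# Example: text-to-image retrieval over a small gallery of encoded images.
gallery = np.stack([encode_image(i) for i in range(100)])
query = encode_text("a dog catching a frisbee in the park")
print(retrieve(query, gallery, top_k=5))
```

The same ranking step works in the other direction (image-to-text) by swapping which modality supplies the query and which supplies the gallery; only the choice of embedding space, here the paper's limited text space, distinguishes methods.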

【 License 】

CC BY | CC BY-ND | CC BY-NC | CC BY-NC-ND

【 Preview 】
Attachments
File: RO202107100000062ZK.pdf (323 KB, PDF, download)
Metrics
Downloads: 11; Views: 4