Journal article

【Abstract】

Hashtags of microblogs can provide valuable information for many natural language processing tasks, and how to recommend reliable hashtags automatically has attracted considerable attention. However, existing studies assume that all training data crawled from social networks are labelled correctly, whereas large-sample statistics on real social media show that 8.9% of microblogs with hashtags carry wrong labels. The notable influence of this noisy data on the classifier has so far been ignored. Recency also plays an important role in microblog hashtags, but existing studies do not use this information. Some temporal hashtags, such as World Cup, surge at a particular time, while at other times the number of people talking about them decreases sharply. To address these two shortcomings, the authors propose a long short-term memory-based model that uses temporal enhanced selective sentence-level attention to reduce the influence of wrongly labelled microblogs on the classifier. Experimental results on a dataset of 1.7 million microblogs collected from SINA Weibo demonstrate that the proposed method achieves significantly better performance than the state-of-the-art methods.
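
The abstract does not give the attention formula, so the following is only a minimal sketch of the general idea of selective attention over a bag of microblogs that share a hashtag, with a recency term added. The function names, the dot-product scoring, the linear age penalty and the `decay` parameter are all assumptions made for illustration; they are not taken from the paper, and the sentence vectors stand in for whatever an LSTM encoder would produce.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def temporal_selective_attention(sentence_vecs, query, ages_in_days, decay=0.1):
    """Weight each microblog vector in a bag and return (weights, bag_vector).

    sentence_vecs: one encoded vector per microblog (e.g. LSTM outputs).
    query:         a vector representing the candidate hashtag.
    ages_in_days:  how old each microblog is; older posts are down-weighted.
    decay:         hypothetical strength of the recency penalty (assumption).
    """
    scores = []
    for vec, age in zip(sentence_vecs, ages_in_days):
        # Content match between the microblog and the hashtag query.
        match = sum(v * q for v, q in zip(vec, query))
        # Assumed temporal enhancement: subtract a penalty that grows with age.
        scores.append(match - decay * age)

    weights = softmax(scores)

    # The bag representation is the attention-weighted sum of the vectors.
    dim = len(query)
    bag = [
        sum(w * vec[d] for w, vec in zip(weights, sentence_vecs))
        for d in range(dim)
    ]
    return weights, bag


if __name__ == "__main__":
    vecs = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
    weights, bag = temporal_selective_attention(vecs, [1.0, 0.0], [0, 0, 30])
    print(weights, bag)
```

In this sketch a microblog that matches the hashtag poorly (a likely wrong label) or that is old receives a small weight, so it contributes little to the bag vector passed on to a classifier.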

【License】

CC BY|CC BY-ND|CC BY-NC|CC BY-NC-ND

【Preview】

Attachments
Files	Size	Format	View
RO202107100000084ZK.pdf	171KB	PDF	download

CAAI Transactions on Intelligence Technology
Temporal enhanced sentence-level attention model for hashtag recommendation
article
Jun Ma¹ Chong Feng¹ Ge Shi¹ Xuewen Shi¹ Heyang Huang¹
[1] Beijing Engineering Research Center of High Volume Language Information Processing and Cloud Computing Applications, College of Computer Science, Beijing Institute of Technology University
Keywords: information retrieval; social networking (online); recommender systems; natural language processing; pattern classification; text analysis; temporal enhanced sentence-level attention model; natural language processing tasks; training corpus; social networks; sample statistics; social media; wrong labels; classifier; microblog hashtag; temporal hashtags; temporal enhanced selective sentence-level attention; wrong labelled microblogs; SINA Weibo microblogs; hashtag recommendation; C6130D Document processing techniques; C6180N Natural language processing; C7210N Information networks; C7250R Information retrieval techniques
DOI : 10.1049/trit.2018.0012
Subject classification: Mathematics (general)
Source: Wiley
PDF


	Article metrics
	Downloads: 8	Views: 2
