Journal Article Details
CAAI Transactions on Intelligence Technology
Bayesian estimation-based sentiment word embedding model for sentiment analysis
article
Jingyao Tang [1], Yun Xue [1], Ziwen Wang [1], Shaoyang Hu [2], Tao Gong [3], Yinong Chen [5], Haoliang Zhao [1], Luwei Xiao [1]
[1] Guangdong Provincial Key Laboratory of Quantum Engineering and Quantum Materials, School of Physics and Telecommunication Engineering, South China Normal University
[2] College of Mathematics and Informatics & College of Software Engineering, South China Agricultural University
[3] School of Foreign Languages, Zhejiang University of Finance & Economics
[4] Educational Testing Service, Princeton
[5] School of Computing, Informatics and Decision Systems Engineering, Arizona State University
Keywords: text analysis; learning (artificial intelligence); probability; Bayes methods; natural language processing; pattern classification
DOI: 10.1049/cit2.12037
Subject classification: Mathematics (general)
Source: Wiley
【 Abstract 】

Sentiment word embedding has been extensively studied and used in sentiment analysis tasks. However, most existing models fail to differentiate between high-frequency and low-frequency words. As a result, the sentiment information of low-frequency words is insufficiently captured, which yields inaccurate sentiment word embeddings and degrades the overall performance of sentiment analysis. A Bayesian estimation-based sentiment word embedding (BESWE) model is proposed to extract the sentiment information of low-frequency words precisely. In the model, a Bayesian estimator is constructed from the co-occurrence probabilities and sentiment probabilities of words, and a novel loss function is defined for learning the sentiment word embeddings. Experimental results on sentiment lexicons and the Movie Review dataset show that BESWE outperforms many state-of-the-art methods, for example C&W, CBOW, GloVe, SE-HyRank and DLJT1, in sentiment analysis tasks. This demonstrates that Bayesian estimation can effectively capture the sentiment information of low-frequency words and integrate it into the word embedding through the loss function. In addition, replacing the embeddings of low-frequency words in the state-of-the-art methods with those from BESWE significantly improves the performance of those methods in sentiment analysis tasks.
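
The abstract does not spell out the estimator, so the following is only a minimal sketch of why Bayesian estimation can help with low-frequency words. It is not the BESWE estimator or its loss function. It assumes a simple Beta prior over a word's probability of being positive; the function names and the parameters `prior_mean` and `prior_strength` are hypothetical and chosen purely for illustration.

```python
def mle_positive_prob(pos_count, total_count):
    """Maximum-likelihood estimate: the raw fraction of positive occurrences."""
    return pos_count / total_count


def bayes_positive_prob(pos_count, total_count, prior_mean=0.5, prior_strength=10.0):
    """Posterior mean of P(positive | word) under a Beta prior.

    ASSUMPTION: the Beta prior and its parameters are illustrative only;
    they are not taken from the paper.
    """
    alpha = prior_mean * prior_strength          # prior pseudo-count of positive uses
    beta = (1.0 - prior_mean) * prior_strength   # prior pseudo-count of negative uses
    return (pos_count + alpha) / (total_count + alpha + beta)


# A rare word seen twice, both times in positive contexts.
rare_mle = mle_positive_prob(2, 2)
rare_bayes = bayes_positive_prob(2, 2)

# A frequent word seen 1000 times, 900 of them in positive contexts.
frequent_mle = mle_positive_prob(900, 1000)
frequent_bayes = bayes_positive_prob(900, 1000)

print(f"rare word:     MLE={rare_mle:.3f}  Bayes={rare_bayes:.3f}")
print(f"frequent word: MLE={frequent_mle:.3f}  Bayes={frequent_bayes:.3f}")
```

With only two observations, the raw fraction for the rare word jumps to an extreme value, while the posterior mean stays close to the prior. For the frequent word the two estimates nearly coincide, because the data outweigh the prior.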

【 License 】

CC BY|CC BY-ND|CC BY-NC|CC BY-NC-ND   
