| CAAI Transactions on Intelligence Technology | |
| Bayesian estimation-based sentiment word embedding model for sentiment analysis | |
| article | |
| Jingyao Tang1  Yun Xue1  Ziwen Wang1  Shaoyang Hu2  Tao Gong3  Yinong Chen5  Haoliang Zhao1  Luwei Xiao1  | |
| [1] Guangdong Provincial Key Laboratory of Quantum Engineering and Quantum Materials, School of Physics and Telecommunication Engineering, South China Normal University; [2] College of Mathematics and Informatics & College of Software Engineering, South China Agricultural University; [3] School of Foreign Languages, Zhejiang University of Finance & Economics; [4] Educational Testing Service, Princeton; [5] School of Computing, Informatics and Decision Systems Engineering, Arizona State University | |
| Keywords: text analysis; learning (artificial intelligence); probability; Bayes methods; natural language processing; pattern classification | |
| DOI : 10.1049/cit2.12037 | |
| Subject classification: Mathematics (general) | |
| Source: Wiley | |
【 Abstract 】
Sentiment word embedding has been extensively studied and used in sentiment analysis tasks. However, most existing models fail to differentiate between high-frequency and low-frequency words. As a result, the sentiment information of low-frequency words is insufficiently captured, which leads to inaccurate sentiment word embeddings and degrades the overall performance of sentiment analysis. A Bayesian estimation-based sentiment word embedding (BESWE) model is proposed that aims to extract the sentiment information of low-frequency words precisely. In the model, a Bayesian estimator is constructed from the co-occurrence probabilities and sentiment probabilities of words, and a novel loss function is defined for sentiment word embedding learning. Experimental results on the sentiment lexicons and the Movie Review dataset show that BESWE outperforms many state-of-the-art methods, for example C&W, CBOW, GloVe, SE-HyRank and DLJT1, in sentiment analysis tasks. This demonstrates that Bayesian estimation can effectively capture the sentiment information of low-frequency words and integrate it into the word embedding through the loss function. In addition, replacing the embeddings of low-frequency words in the state-of-the-art methods with those from BESWE can significantly improve the performance of those methods in sentiment analysis tasks.
【 License 】
CC BY|CC BY-ND|CC BY-NC|CC BY-NC-ND