| 2019 The 5th International Conference on Electrical Engineering, Control and Robotics | |
| Short Text Classification Improved by Feature Space Extension | |
| Radio Electronics; Computer Science | |
| Li, Yanxuan^1 | |
| School of Computer Science, Beijing University of Posts and Telecommunications, Beijing, China^1 | |
| Keywords: Convolutional neural network; Feature space; Latent Dirichlet allocations; Mobile Internet; Movie reviews; Semantic information; Short text classifications; Validation results | |
| Others: https://iopscience.iop.org/article/10.1088/1757-899X/533/1/012046/pdf DOI: 10.1088/1757-899X/533/1/012046 | |
| Subject classification: Computer Science (General) | |
| Source: IOP | |
【 Abstract 】
With the explosive development of the mobile Internet, short text has come into widespread use. Unlike long documents, short texts are brief and sparse, so they carry little semantic information, which makes short text classification challenging. In this paper, we propose a novel topic-based convolutional neural network (TB-CNN) that combines the Latent Dirichlet Allocation (LDA) model with a convolutional neural network. Compared with traditional CNN methods, TB-CNN generates topic words with the LDA model to reduce sparseness and combines the embedding vectors of the topic words and the input words to extend the feature space of short text. Validation results on the IMDB movie review dataset demonstrate the improvement and effectiveness of TB-CNN.
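The feature space extension described in the abstract can be illustrated with a minimal sketch. The abstract does not say how topic words are selected or how the two sets of embedding vectors are combined, so everything here is an assumption: the toy topic-word table `TOPIC_WORDS` stands in for a trained LDA model, the random vectors stand in for pretrained embeddings, the dominant topic is picked by a crude summed-probability score rather than real LDA inference, and the combination is a simple append of topic-word rows to the input-word rows. All function names are hypothetical.

```python
import random

# Hypothetical toy topic-word distributions, standing in for a trained LDA model.
TOPIC_WORDS = {
    0: {"movie": 0.30, "plot": 0.25, "actor": 0.25, "scene": 0.20},
    1: {"boring": 0.35, "awful": 0.30, "waste": 0.20, "dull": 0.15},
}
EMBED_DIM = 4


def build_embeddings(vocab, dim=EMBED_DIM, seed=0):
    """Random vectors standing in for pretrained word embeddings."""
    rng = random.Random(seed)
    return {w: [rng.uniform(-1, 1) for _ in range(dim)] for w in sorted(vocab)}


def dominant_topic(tokens, topic_words):
    """Pick the topic with the largest summed word probability.

    This is a crude proxy for LDA inference, used only for illustration.
    """
    scores = {t: sum(dist.get(w, 0.0) for w in tokens)
              for t, dist in topic_words.items()}
    return max(scores, key=scores.get)


def top_topic_words(topic, topic_words, k):
    """Return the k most probable words of the given topic."""
    dist = topic_words[topic]
    return sorted(dist, key=dist.get, reverse=True)[:k]


def extend_features(tokens, topic_words, embeddings, k=2):
    """Append the embeddings of k topic words to the input word embeddings.

    Words without an embedding are mapped to a zero vector. The result is a
    (len(tokens) + k) x EMBED_DIM matrix that a CNN could consume.
    """
    topic = dominant_topic(tokens, topic_words)
    extra = top_topic_words(topic, topic_words, k)
    zero = [0.0] * EMBED_DIM
    return [embeddings.get(w, zero) for w in tokens + extra]


vocab = {w for dist in TOPIC_WORDS.values() for w in dist}
embeddings = build_embeddings(vocab)
matrix = extend_features(["awful", "boring", "plot"], TOPIC_WORDS, embeddings)
print(len(matrix), len(matrix[0]))  # 5 rows (3 input + 2 topic words), 4 columns
```

Appending rows is only one possible reading of "combines"; the paper may instead use separate input channels or another fusion scheme, which the abstract alone does not settle.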
【 Preview 】
| Files | Size | Format | View |
|---|---|---|---|
| Short Text Classification Improved by Feature Space Extension | 502KB | PDF | |