2019 International Conference on Advanced Electronic Materials, Computers and Materials Engineering | |
A Study of the Chinese spam Classification with Doc2vec and CNN | |
Radio electronics; Computer science; Materials science | |
Gong, Hechen^1 ; You, Fucheng^1 ; Wang, Shaomei^1 | |
School of Information Engineering, Beijing Institute of Graphic Communication, Beijing 102600, China^1 | |
Keywords: Chinese spam; Chinese text; Classification results; Convolution neural network; Granularity levels; Hotspots; Natural language processing; Word vectors; | |
Others : https://iopscience.iop.org/article/10.1088/1757-899X/563/4/042026/pdf DOI : 10.1088/1757-899X/563/4/042026 |
Source: IOP | |
【 Abstract 】
A convolution neural network is a kind of neural network that has proved very effective in image recognition and classification. In recent years, convolution neural networks have gradually been applied to natural language processing and have become one of its research hotspots. When a convolution neural network classifies text built from word vectors, however, it considers relationships only at the word granularity level and ignores the relationships between words and between their semantics, which affects the classification results. In this paper, a method based on Doc2vec and CNN is proposed to classify spam. First, the spam is preprocessed; then the sentence vectors and word vectors of the Chinese text are trained with Doc2vec; finally, the trained text vectors are classified by a convolution neural network.
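The final stage of the pipeline described in the abstract, convolving over text vectors and classifying the pooled features, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation. The toy vectors stand in for Doc2vec output, the kernel and classifier weights are hand-picked rather than trained, and stacking the sentence vector on top of the word vectors is an assumption, since the abstract does not say how the two are combined. The names `build_input`, `conv_max_pool`, and `classify` are hypothetical.

```python
from typing import List

Vector = List[float]


def build_input(sentence_vector: Vector, word_vectors: List[Vector]) -> List[Vector]:
    """Stack the sentence vector above the word vectors (an assumed layout)."""
    return [sentence_vector] + word_vectors


def conv_max_pool(rows: List[Vector], kernel: List[Vector], bias: float = 0.0) -> float:
    """Slide a kernel over consecutive rows, apply ReLU, and max-pool over positions."""
    height = len(kernel)
    if len(rows) < height:
        raise ValueError("input has fewer rows than the kernel height")
    best = 0.0  # ReLU outputs are never negative, so 0.0 is a safe floor
    for start in range(len(rows) - height + 1):
        total = bias
        for k_row, x_row in zip(kernel, rows[start:start + height]):
            total += sum(w * x for w, x in zip(k_row, x_row))
        best = max(best, max(0.0, total))
    return best


def classify(features: Vector, weights: Vector, bias: float) -> int:
    """Linear decision on the pooled features: 1 = spam, 0 = not spam."""
    score = bias + sum(w * f for w, f in zip(weights, features))
    return 1 if score > 0 else 0


# Toy 2-dimensional vectors standing in for Doc2vec output (illustrative only).
sentence_vector = [1.0, 0.0]
word_vectors = [[0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
matrix = build_input(sentence_vector, word_vectors)

# Two hand-picked kernels, each spanning two consecutive rows.
kernels = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[-1.0, 0.0], [0.0, -1.0]],
]
features = [conv_max_pool(matrix, k) for k in kernels]
label = classify(features, weights=[1.0, -1.0], bias=-1.0)
```

In a real system the vectors would come from a trained Doc2vec model and the kernels and classifier weights would be learned from labeled messages. The sketch only shows how a window slides over the stacked vectors and how max pooling reduces each filter's responses to a single feature.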
【 Preview 】
Files | Size | Format | View |
---|---|---|---|
A Study of the Chinese spam Classification with Doc2vec and CNN | 676KB | PDF | |