Journal article details
IAENG International Journal of Computer Science
Latent Semantic Analysis using a Dennis Coefficient for English Sentiment Classification in a Parallel System
Phu Vo Ngoc [1]
[1] Institute of Research and Development, Duy Tan University (DTU), Da Nang, Vietnam
Keywords: English sentiment classification; parallel system; Cloudera; Hadoop Map and Hadoop Reduce; Dennis Measure; Latent Semantic Analysis
DOI: 10.15837/ijccc.2018.3.3044
Subject classification: Computer Science (General)
Source: International Association of Engineers
【 Abstract 】

We have surveyed many significant approaches to sentiment classification over many years, because it makes crucial contributions that can be applied in everyday life, such as in political activities, commodity production, and commercial activities. We propose a novel model that uses Latent Semantic Analysis (LSA) and a Dennis Coefficient (DNC) for big data sentiment classification in English. Many LSA vectors (LSAVs) have been successfully reformed by using the DNC. We use the DNC and the LSAVs to classify the 11,000,000 documents of our testing data set based on the 5,000,000 documents of our training data set in English. The model uses many sentiment lexicons from our basis English sentiment dictionary (bESD). We have tested the proposed model in both a sequential environment and a distributed network system. The results of the sequential system are not as good as those of the parallel environment. We achieved 88.76% accuracy on the testing data set, which is better than the accuracies of many previous semantic analysis models. We have also compared the novel model with previous models, and the experimental results of our proposed model are better than those of the previous models. The results of the novel model can be widely used in many different fields, in commercial applications and surveys of sentiment classification.
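
As a rough illustration of the similarity measure named in the abstract, the sketch below computes a Dennis coefficient between two binary vectors. It uses the definition commonly given in surveys of binary similarity measures, (ad - bc) / sqrt(n(a + b)(a + c)). The abstract does not state the formula or how the LSA vectors are built, so the binary encoding, the function names `dennis_coefficient` and `classify`, and the nearest-reference decision rule are all assumptions made for illustration, not the paper's method.

```python
import math


def dennis_coefficient(x, y):
    """Dennis coefficient between two equal-length binary vectors.

    Assumed definition: (a*d - b*c) / sqrt(n * (a + b) * (a + c)), where
    a = both 1, b = only x is 1, c = only y is 1, d = both 0, n = a+b+c+d.
    """
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    a = sum(1 for i, j in zip(x, y) if i and j)
    b = sum(1 for i, j in zip(x, y) if i and not j)
    c = sum(1 for i, j in zip(x, y) if not i and j)
    d = sum(1 for i, j in zip(x, y) if not i and not j)
    n = a + b + c + d
    denom = math.sqrt(n * (a + b) * (a + c))
    if denom == 0:
        # Undefined when either vector is all zeros; return 0.0 by convention here.
        return 0.0
    return (a * d - b * c) / denom


def classify(doc, positive_refs, negative_refs):
    """Hypothetical rule: label a document by its most similar reference vector."""
    pos = max(dennis_coefficient(doc, r) for r in positive_refs)
    neg = max(dennis_coefficient(doc, r) for r in negative_refs)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    doc = [1, 1, 0, 0]
    print(classify(doc, [[1, 1, 0, 0]], [[0, 0, 1, 1]]))
```

In a parallel setting such as Hadoop, a comparison like this could in principle run independently per test document in the map phase, with labels aggregated in the reduce phase. That split is a plausible reading of the abstract and is not confirmed by it.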

【 License 】

CC BY   
