Journal Article Details
American Journal of Applied Sciences
Features Reweighting and Similarity Coefficient Based Method for Email Spam Filtering
Ali Elsiddig, Ahmed Osman
Keywords: Spam; Spam Filtering; Feature Selection; Similarity Coefficient
DOI: 10.3844/ajassp.2017.983.993
Subject classification: Natural Sciences (General)
Source: Science Publications
【 Abstract 】

Spam floods the Internet with many copies of the same message in an attempt to force it on people who would not otherwise choose to receive it. Deciding whether an incoming email is spam has therefore become an important problem. One of the main characteristics of the spam filtering problem is the high dimensionality of its feature space, which calls for a dimensionality reduction stage. This study addresses that aspect of spam detection by examining the effect of reweighting features. The work began by applying similarity coefficients to the dataset, then reweighted the features and applied the same similarity coefficients to the reweighted dataset. Finally, the results before and after reweighting were compared with each other and with a feature selection method. The objectives of the study are to examine the similarity coefficients (Cosine and Dice) and to examine how the important features affect the other features through the reweighting process. The most important results are that reweighting did not improve the success rate of either coefficient (Cosine or Dice), whereas feature selection improved detection with Cosine.
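
The abstract names the Cosine and Dice coefficients without giving their formulas, so the following is only a minimal sketch of how such a comparison might look. It assumes the common real-valued forms of both coefficients and a simple nearest-neighbour decision rule. The helpers `reweight` and `classify`, the multiplicative weighting scheme, and the toy term-count vectors are all hypothetical illustrations, not the authors' method or data.

```python
import math


def cosine(a, b):
    """Cosine coefficient: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def dice(a, b):
    """Dice coefficient (real-valued form): 2 * dot(a, b) / (|a|^2 + |b|^2)."""
    dot = sum(x * y for x, y in zip(a, b))
    denom = sum(x * x for x in a) + sum(y * y for y in b)
    return 2 * dot / denom if denom else 0.0


def reweight(vec, weights):
    """Hypothetical reweighting: scale each feature by a per-feature weight."""
    return [x * w for x, w in zip(vec, weights)]


def classify(query, labelled, sim):
    """Return the label of the most similar labelled vector (assumed 1-NN rule)."""
    return max(labelled, key=lambda item: sim(query, item[0]))[1]


# Toy term-count vectors over three made-up features.
labelled = [([3, 0, 1], "spam"), ([0, 2, 1], "ham")]
incoming = [2, 0, 1]

print(classify(incoming, labelled, cosine))
print(classify(incoming, labelled, dice))

# The same comparison after boosting the first feature (weights are arbitrary).
weights = [2.0, 1.0, 1.0]
reweighted = [(reweight(v, weights), label) for v, label in labelled]
print(classify(reweight(incoming, weights), reweighted, cosine))
```

Running the classification once on the raw vectors and once on the reweighted vectors mirrors the before-and-after comparison described in the abstract. The weights themselves would have to come from whatever importance measure the study used.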

【 License 】

CC BY   
