Conference Paper Details
The 2nd International Workshop on Adversarial Information Retrieval on the Web
Link-Based Characterization and Detection of Web Spam
Computer Science; Library, Information and Archival Science
Luca Becchetti ; Carlos Castillo ; Debora Donato ; Stefano Leonardi ; Ricardo Baeza-Yates
Others  :  http://airweb.cse.lehigh.edu/2006/becchetti.pdf
PID  :  7238
Source: CEUR
【 Abstract 】

We perform a statistical analysis of a large collection of Web pages, focusing on spam detection. We study several metrics such as degree correlations, number of neighbors, rank propagation through links, TrustRank and others to build several automatic web spam classifiers. This paper presents a study of the performance of each of these classifiers alone, as well as their combined performance. Using this approach we are able to detect 80.4% of the Web spam in our sample, with only 1.1% of false positives.
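
To make the two figures quoted in the abstract concrete, the sketch below shows one common way a detection rate and a false-positive rate are computed from labeled pages. It is an illustration only: the function name `detection_metrics` and the toy labels are hypothetical, and it assumes the false-positive figure is the share of non-spam pages wrongly flagged as spam, which the abstract itself does not spell out.

```python
def detection_metrics(labels, predictions):
    """Return (detection_rate, false_positive_rate).

    labels      -- iterable of booleans, True if the page really is spam
    predictions -- iterable of booleans, True if the classifier flagged it
    Assumption: false-positive rate = wrongly flagged non-spam / all non-spam.
    """
    tp = fp = tn = fn = 0
    for is_spam, flagged in zip(labels, predictions):
        if is_spam and flagged:
            tp += 1          # spam correctly detected
        elif is_spam:
            fn += 1          # spam that was missed
        elif flagged:
            fp += 1          # non-spam wrongly flagged
        else:
            tn += 1          # non-spam correctly left alone
    detection_rate = tp / (tp + fn) if (tp + fn) else 0.0
    false_positive_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return detection_rate, false_positive_rate


# Hypothetical toy sample: 4 spam pages followed by 6 non-spam pages.
labels = [True, True, True, True, False, False, False, False, False, False]
predictions = [True, True, True, False, True, False, False, False, False, False]
rate, fpr = detection_metrics(labels, predictions)
print(f"detection rate: {rate:.1%}, false-positive rate: {fpr:.1%}")
```

Under this reading, the abstract's result corresponds to a detection rate of 0.804 and a false-positive rate of 0.011 on the authors' sample.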
