Conference Paper Details
The 2nd International Workshop on Adversarial Information Retrieval on the Web
Link-Based Characterization and Detection of Web Spam
Computer Science; Library, Information and Archival Science
Luca Becchetti ; Carlos Castillo ; Debora Donato ; Stefano Leonardi ; Ricardo Baeza-Yates
Others: http://airweb.cse.lehigh.edu/2006/becchetti.pdf PID: 7238
Source: CEUR
[Abstract]
We perform a statistical analysis of a large collection of Web pages, focusing on spam detection. We study several metrics such as degree correlations, number of neighbors, rank propagation through links, TrustRank and others to build several automatic web spam classifiers. This paper presents a study of the performance of each of these classifiers alone, as well as their combined performance. Using this approach we are able to detect 80.4% of the Web spam in our sample, with only 1.1% of false positives.
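The two figures quoted in the abstract are a detection rate and a false-positive rate. As a minimal sketch of how such figures are commonly computed, the Python below assumes the usual definitions: detection rate is the share of spam pages flagged as spam, and false-positive rate is the share of non-spam pages flagged as spam. The abstract does not state the exact definitions the authors used, so treat these as assumptions. The names `Confusion`, `tally`, `detection_rate` and `false_positive_rate`, and the toy labels, are hypothetical and not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Counts of classifier outcomes, with spam as the positive class."""
    tp: int  # spam flagged as spam
    fp: int  # non-spam flagged as spam
    tn: int  # non-spam left unflagged
    fn: int  # spam left unflagged


def tally(labels, predictions):
    """Build a Confusion from parallel lists of booleans (True means spam)."""
    tp = fp = tn = fn = 0
    for is_spam, flagged in zip(labels, predictions):
        if is_spam and flagged:
            tp += 1
        elif is_spam:
            fn += 1
        elif flagged:
            fp += 1
        else:
            tn += 1
    return Confusion(tp, fp, tn, fn)


def detection_rate(c):
    """Assumed definition: fraction of spam pages that were flagged."""
    total_spam = c.tp + c.fn
    return c.tp / total_spam if total_spam else 0.0


def false_positive_rate(c):
    """Assumed definition: fraction of non-spam pages that were flagged."""
    total_normal = c.fp + c.tn
    return c.fp / total_normal if total_normal else 0.0


# Toy data only (not from the paper): 4 spam pages, 8 non-spam pages.
labels = [True] * 4 + [False] * 8
predictions = [True, True, True, False] + [True] + [False] * 7

counts = tally(labels, predictions)
print(detection_rate(counts), false_positive_rate(counts))
```

On this toy sample, three of the four spam pages are flagged and one of the eight non-spam pages is flagged. The two rates are reported separately because spam is typically a minority class, so a single accuracy figure would hide the cost of wrongly flagging legitimate pages.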