The 2nd International Workshop on Adversarial Information Retrieval on the Web | |
Improving Cloaking Detection Using Search Query Popularity and Monetizability | |
Computer Science; Library, Information and Archival Science | |
Kumar Chellapilla ; David Maxwell Chickering | |
Others : http://airweb.cse.lehigh.edu/2006/chellapilla.pdf PID : 7231 |
Source: CEUR | |
【 Abstract 】
Cloaking is a search engine spamming technique used by some Web sites to deliver one page to a search engine for indexing while serving an entirely different page to users browsing the site. In this paper, we show that the degree of cloaking among search results depends on query properties such as popularity and monetizability. We propose estimating query popularity and monetizability by analyzing search engine query logs and online advertising click-through logs, respectively. We also present a new measure for detecting cloaked URLs that uses a normalized term frequency ratio between multiple downloaded copies of Web pages. Experiments are conducted using 10,000 search queries and 3 million associated search result URLs. Experimental results indicate that while only 73.1% of the cloaked popular search URLs are spam, over 98.5% of the cloaked monetizable search URLs are spam. Further, on average, the search results for the top 2% most cloaked queries are 10x more likely to be cloaked than those for the bottom 98% of queries.
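The abstract does not spell out the cloaking measure, so the following is only a minimal sketch of one way a normalized term frequency ratio over multiple downloaded copies could be computed. It assumes two copies fetched as a crawler and two fetched as a browser, whitespace tokenization, and a score defined as the smaller crawler-versus-browser difference divided by the larger same-client difference. The names `term_counts`, `normalized_tf_difference`, and `cloaking_score` are hypothetical, and the formula is an illustrative assumption, not the definition given in the paper.

```python
from collections import Counter


def term_counts(text):
    """Bag of words for one downloaded copy (assumed whitespace tokenization)."""
    return Counter(text.lower().split())


def normalized_tf_difference(a, b):
    """Size of the multiset symmetric difference divided by the multiset union."""
    diff = sum(((a - b) + (b - a)).values())
    union = sum((a | b).values())
    return diff / union if union else 0.0


def cloaking_score(crawler_copies, browser_copies):
    """Assumed ratio: cross-client difference over same-client difference.

    A high score means the crawler and browser copies differ far more than
    two copies fetched by the same client, which ordinary dynamic content
    alone would not explain.
    """
    c1, c2 = map(term_counts, crawler_copies)
    b1, b2 = map(term_counts, browser_copies)
    cross = min(normalized_tf_difference(c1, b1),
                normalized_tf_difference(c2, b2))
    within = max(normalized_tf_difference(c1, c2),
                 normalized_tf_difference(b1, b2))
    if within == 0.0:
        # Static page: any crawler/browser difference is maximally suspicious.
        return float("inf") if cross > 0.0 else 0.0
    return cross / within
```

Dividing by the same-client difference is the design choice that matters here: a page that changes on every request (rotating ads, timestamps) produces a large difference in both the numerator and the denominator, so its score stays near 1 rather than being flagged.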
【 Preview 】
Files | Size | Format | View |
---|---|---|---|
Improving Cloaking Detection Using Search Query Popularity and Monetizability | 325KB | download |