Description |
Some websites on the Internet have just one purpose: to make you click on an affiliate link to a sales page, which, in turn, earns the affiliated partner a share of the sale. We call such websites Affiliate Spam. Since it is purely accidental if these websites fulfill any information need, they rely on SEO and abuse recommendation engines to attract visitors. We believe that affiliate spam should be put in its place, which is much lower in the ranking of a search engine.
In this project, we will start by detecting offending websites, hence the name Affiliate Spam Detection. Goals: Find affiliate spam in the CommonCrawl and on Google.
Then find features to detect it, such as links to Amazon, affiliate signatures in links, SEO compliance, and main-content features. |
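
The link-based features mentioned above (links to Amazon, affiliate signatures in links) could be extracted roughly as follows. This is only a minimal sketch using the Python standard library, not the project's actual method: the names `AFFILIATE_PARAMS`, `LinkCollector`, and `link_features` are hypothetical, and the parameter list is an assumption (Amazon affiliate links typically carry a `tag` query parameter; the other entries are illustrative guesses that would need to be validated against real data).

```python
from html.parser import HTMLParser
from urllib.parse import urlparse, parse_qs

# Assumption: query parameters that may carry an affiliate ID.
# "tag" is the one used in Amazon affiliate links; the rest are illustrative.
AFFILIATE_PARAMS = {"tag", "affid", "aff_id"}


class LinkCollector(HTMLParser):
    """Collects the href of every <a> element in an HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def is_amazon_link(url):
    """True if the host looks like an Amazon domain or the amzn.to shortener."""
    host = urlparse(url).netloc.lower()
    return "amazon" in host.split(".") or host.endswith("amzn.to")


def has_affiliate_signature(url):
    """True if the URL has at least one query parameter from AFFILIATE_PARAMS."""
    params = parse_qs(urlparse(url).query)
    return any(name in AFFILIATE_PARAMS for name in params)


def link_features(html):
    """Computes simple link-based features for a single page."""
    collector = LinkCollector()
    collector.feed(html)
    links = collector.links
    affiliate = [u for u in links if has_affiliate_signature(u)]
    return {
        "total_links": len(links),
        "amazon_links": sum(is_amazon_link(u) for u in links),
        "affiliate_links": len(affiliate),
        # Share of outgoing links that carry an affiliate signature.
        "affiliate_ratio": len(affiliate) / len(links) if links else 0.0,
    }
```

A page whose `affiliate_ratio` is high is only a candidate, not a confirmed case of affiliate spam; such features would still have to be combined with the SEO and main-content features and evaluated on labeled pages.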