



Anti-spam: how to get rid of referer spam

They love to fill our server logs with links pointing to their websites to improve their page ranking. Since they like to eat our bandwidth, we will eat theirs.

Introduction

Referer (or referrer) spam is described by Wikipedia as "a kind of spamdexing (spamming aimed at search engines). The technique involves making repeated web site requests using a fake referrer url that points to the site the spammer wishes to advertise. Sites that publicize their access logs, including referrer statistics, will then end up linking to the spammer's site, which will in turn be indexed by the search engines as they crawl the access logs. This benefits the spammer because of the free link, and also gives the spammer's site improved search engine placement due to link-counting algorithms that search engines use."

If you run a web server, you are probably facing this problem. Fortunately, if spammers can bother us, we can bother them too and make their life a little harder.

Example

Here is an example of referer spam (found in the spamCle@ner logs) for a French website, europe-backup.com:

  62.210.213.254 - - [06/Sep/2007:03:21:30 +0200] "GET / HTTP/1.0" 200 6795
  "http://www.europe-backup.com/archives/?r=57588,60,11"
  "Mozilla/5.0 (Windows; U; Windows NT 5.1; fr; rv:1.8.0.6) Gecko/20060728 Firefox/1.5.0.6"

  62.210.233.68 - - [06/Sep/2007:04:19:40 +0200] "GET / HTTP/1.0" 200 6795
  "http://www.europe-backup.com/archives/?r=57588,60,11"
  "Mozilla/5.0 (Windows; U; Windows NT 5.1; fr; rv:1.8.0.6) Gecko/20060728 Firefox/1.5.0.6"

If you follow the address above, you won't find any link pointing to spamCle@ner on that page. This is typical of the method, and it is also evidence that Cisneo Ltd. is involved in referer spam. We can also look at the reverse DNS of those two IPs:

  62.210.213.254 = Brrrrrrrrrrrr.kisscool.be
  62.210.233.68 = Kiritimati.ath.cx

Solutions

We will simply use Apache's mod_rewrite module to get rid of referer spam. We will not only block the spammers but also play with them: we will send them back to where they came from, so that each time they try to eat our bandwidth they end up eating their own. Finally, and this is the most important part, we will do exactly the opposite of what they expect: we will not keep any record of their visit in our server logs.

  • Targeted example
  • In this first example, we know the spammer's website: europe-backup.com.
    You can either create a .htaccess file in your website's root directory or add the following lines to your Apache website configuration file:

       RewriteEngine on
       RewriteCond %{HTTP_REFERER} ^http://www\.europe-backup\.com [NC]
       RewriteRule ^.* http://www.europe-backup.com [L,E=nolog:1]
    

    1st line: activates mod_rewrite.
    2nd line: looks for a referer starting with "http://www.europe-backup.com". [NC] means case-insensitive.
    3rd line: replaces the requested URL (^.* matches any path) with a redirection to the spammer's website. Because the target is an absolute URL on another host, mod_rewrite answers with an external redirect. [L] means that this is the last rule to apply: no further rewrite rules are processed. The last flag, [E=nolog:1], is very important: it creates an environment variable (nolog=1) that Apache will use when logging.

    In your Apache configuration file you should have a line almost identical to this one:

      CustomLog logs/access_log combined
    

    Just add 'env=!nolog' at the end of that line to tell Apache not to log the request when the 'nolog' variable is set:

      CustomLog logs/access_log combined env=!nolog
    

    Restart Apache. From now on, each time that company tries to spam you, it will be sent back to its own website (and use its own bandwidth), and Apache will not write anything to its log.
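
Put together, the targeted rule and the logging change might look like the sketch below. The VirtualHost wrapper, ServerName and DocumentRoot are placeholders that are not part of the original example. The explicit R=302 flag is an addition that only spells out the redirect mod_rewrite would perform anyway for an absolute URL. The sketch also assumes that the "combined" log format is already defined in the main configuration, as it usually is in a stock Apache setup.

```apache
# Sketch only: ServerName, DocumentRoot and the log path are placeholders.
<VirtualHost *:80>
    ServerName www.example.org
    DocumentRoot /var/www/example

    RewriteEngine on
    # Match the spammer's referer (dots escaped, case-insensitive).
    RewriteCond %{HTTP_REFERER} ^http://www\.europe-backup\.com [NC]
    # Send the request back to the spammer and flag it with nolog=1.
    RewriteRule ^ http://www.europe-backup.com/ [R=302,L,E=nolog:1]

    # Log every request except those flagged with nolog.
    CustomLog logs/access_log combined env=!nolog
</VirtualHost>
```

Keeping the rule and the CustomLog line in the same virtual host makes it easier to see that the variable set by the rule is the one the log directive tests.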

  • Untargeted example
  • What can we do when, unlike in the previous example, we do not know the spammer's domain name in advance? That is not a problem: we can still redirect the spammer to his own website, since he is kind enough to give us his URL in the HTTP_REFERER variable. That way you can get rid of a lot of porn, pharmacy and casino websites:

      RewriteCond %{HTTP_REFERER} (poker|casino|pharma|mortgage|viagra|porn) [NC]
      RewriteRule ^.* %{HTTP_REFERER} [L,E=nolog:1]
    

    1st line: look for a referer URL with the above spam names/patterns.
    2nd line: redirect the spammer to his own URL found in the HTTP_REFERER variable.

    And of course, modify the Apache configuration file so that it does not log the request:

      CustomLog logs/access_log combined env=!nolog
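
The untargeted rule redirects to whatever string the client sent as its referer. A slightly more defensive variant, sketched here as an assumption rather than as part of the original recipe, adds a second condition so that the redirect only happens when the referer really is an absolute http or https URL. Consecutive RewriteCond lines are combined with a logical AND by default.

```apache
RewriteEngine on
# Condition 1: the referer must be an absolute http(s) URL.
RewriteCond %{HTTP_REFERER} ^https?:// [NC]
# Condition 2: the referer must contain one of the spam keywords.
RewriteCond %{HTTP_REFERER} (poker|casino|pharma|mortgage|viagra|porn) [NC]
# Both conditions hold: redirect to the referer and skip logging.
RewriteRule ^ %{HTTP_REFERER} [R=302,L,E=nolog:1]
```

The keyword list is the one used above. Since it matches anywhere in the referer, a legitimate site whose URL happens to contain one of these words would be redirected as well, so the list is worth tuning to your own traffic.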
    

  • Various examples
  • If you would rather make the spammer lose time than bandwidth, stretch the response time to his request indefinitely by redirecting it to localhost (127.0.0.1) or to an IANA black hole server:

      RewriteCond %{HTTP_REFERER} (poker|casino|pharma|mortgage|viagra|porn) [NC]
      RewriteRule ^.* http://127.0.0.1 [L,E=nolog:1]
    

    or :

      RewriteCond %{HTTP_REFERER} (poker|casino|pharma|mortgage|viagra|porn) [NC]
      RewriteRule ^.* http://prisoner.iana.org/ [L,E=nolog:1]
    

    You may prefer to redirect the spammer to a huge file on his own site, say 10 MB:

      RewriteCond %{HTTP_REFERER} ^http://foo.bar [NC]
      RewriteRule ^.* http://foo.bar/big_file.pdf [L,E=nolog:1]
    

    Or maybe he has a contact form on his website? Send him back there with the form variables in the URL: each time he visits you, you redirect him to his own contact form, using his own bandwidth, and he automatically posts a message to himself (from his own IP address). Spaces in the message are written as "+" rather than "%20", because mod_rewrite reads "%2" in a substitution as a back-reference to the RewriteCond pattern:

      RewriteCond %{HTTP_REFERER} ^http://foo.bar [NC]
      RewriteRule ^.* http://foo.bar/contact.php?name=me&message=no+spam+thanks [L,E=nolog:1]
    

    Finally, once everything is set up on your server, test your rules to make sure they work:

       # wget -S --referer='http://site_viagra.com' http://you.domain.tld
    


    No doubt you will find similar or better ideas to have fun with spammers!
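
As one further variation, if keeping the logs clean matters more to you than redirecting anyone, the nolog variable can be set without mod_rewrite at all. The sketch below assumes mod_setenvif is loaded and reuses the keyword list from the untargeted example. The request is still served normally, so this saves no bandwidth; it only keeps the spammer's URL out of the access log.

```apache
# Flag any request whose Referer header contains a spam keyword
# (SetEnvIfNoCase matches case-insensitively).
SetEnvIfNoCase Referer "(poker|casino|pharma|mortgage|viagra|porn)" nolog

# Log every request except those flagged with nolog.
CustomLog logs/access_log combined env=!nolog
```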