Referrer Spam Should Be A Crime

...and punished along the lines of "regular" e-mail spam. More to the point, the people who develop and actively market software to do it should be forbidden to do so.

I've thought long and hard before posting this (because it can be used as advertising by the idiots who foster this kind of activity), but in the end I think it might be a wake-up call for the vast majority of people who either own sites but don't know exactly what's happening, or who are fundamentally fed up with having their server resources wasted by these idiots.

The Nuisance, And Why It Doesn't Work

Referrer Spam, besides annoying server operators to no end due to its waste of CPU and disk space, is being actively fought by the search engine developers. Even though it was initially effective at boosting a site's ranking, its practical effects are now virtually nil, because the only thing the spammers are doing is feeding blacklists. Results may show up on search engines for a while, but they're quickly removed as soon as admins (and automated tools) get wise to them.

And it doesn't work here because I excluded the Referrers page from search engines and actively block any User-Agent that behaves suspiciously. That goes for anyone, no matter what they say they are, where they are from or which parts of my site they try to access.
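To give a rough idea of what "behaves suspiciously" could mean in practice, here is a minimal Python sketch. It is not the code running on this site: the function name, the shape of the input and both thresholds are assumptions made up for illustration. It flags any client IP that shows up with an implausible number of different User-Agents or referrer hosts, which is the kind of rotation these spam tools advertise.

```python
from collections import defaultdict
from urllib.parse import urlparse


def suspicious_clients(hits, max_agents=3, max_referrer_hosts=5):
    """Return the set of client IPs that rotate User-Agents or referrers.

    hits is an iterable of (ip, user_agent, referrer) tuples, for example
    parsed from an access log. Both thresholds are illustrative guesses,
    not tuned values.
    """
    agents = defaultdict(set)
    referrer_hosts = defaultdict(set)

    for ip, user_agent, referrer in hits:
        agents[ip].add(user_agent)
        # Empty or "-" referrers have no hostname and are simply skipped.
        host = urlparse(referrer).hostname
        if host:
            referrer_hosts[ip].add(host)

    flagged = set()
    for ip in agents:
        too_many_agents = len(agents[ip]) > max_agents
        too_many_hosts = len(referrer_hosts[ip]) > max_referrer_hosts
        if too_many_agents or too_many_hosts:
            flagged.add(ip)
    return flagged
```

A real visitor behind a shared proxy can trip a check like this, so the thresholds would need adjusting against actual traffic before banning anyone on its say-so.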

It's taken up quite a bit of my time, but I have an active ban list that keeps track of these nuisances and, in the more extreme cases, drops me a note with a pre-written e-mail complaint targeted at the abuse address listed in the pertinent WHOIS record.
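A ban list of this sort can be sketched in a few lines of Python. Everything below is hypothetical (the class, the function, the thresholds and the wording of the complaint), and the abuse address is assumed to have been looked up in the WHOIS record separately and passed in by hand.

```python
from collections import Counter


class BanList:
    """Counts offences per IP and decides when to ban or complain."""

    def __init__(self, ban_after=3, complain_after=10):
        # Illustrative thresholds: ban early, complain only in extreme cases.
        self.strikes = Counter()
        self.ban_after = ban_after
        self.complain_after = complain_after

    def record(self, ip):
        """Register one more offence and return the running total."""
        self.strikes[ip] += 1
        return self.strikes[ip]

    def is_banned(self, ip):
        return self.strikes[ip] >= self.ban_after

    def needs_complaint(self, ip):
        return self.strikes[ip] >= self.complain_after


def draft_complaint(ip, abuse_address, strikes):
    """Build the text of a pre-written complaint, ready to be reviewed and sent."""
    return (
        f"To: {abuse_address}\n"
        f"Subject: Referrer spam originating from {ip}\n"
        "\n"
        f"A host on your network ({ip}) has sent {strikes} requests with forged\n"
        "referrers to my web server. Please investigate and take action.\n"
    )
```

Keeping the drafting step separate from the sending step means a human still gets to read each complaint before it goes out.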

I've occasionally blocked entire netblocks, which, when coupled with a suitable message telling visitors to contact their ISP's tech support, usually brings my abuse message to the ISP's attention. It's tough on the ISP tech support guys, but it works.
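Blocking a whole netblock is straightforward with Python's standard ipaddress module. This is only a sketch under assumptions: the netblocks are documentation ranges, and the message text and function name are invented for the example.

```python
import ipaddress

# Placeholder ranges (reserved for documentation), not real offenders.
BLOCKED_NETBLOCKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/25"),
]

BLOCK_MESSAGE = (
    "Access from your network has been blocked due to abuse. "
    "Please contact your ISP's tech support and ask them to check "
    "their abuse mailbox."
)


def check_client(ip):
    """Return (status, body): 403 plus the message for blocked ranges, else 200."""
    addr = ipaddress.ip_address(ip)
    for net in BLOCKED_NETBLOCKS:
        if addr in net:
            return 403, BLOCK_MESSAGE
    return 200, ""
```

For instance, check_client("192.0.2.55") falls inside the first range and gets the 403, while an address outside both ranges passes through untouched.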

Who Profits From It, And The Marie Celeste Effect

But let's get back to search engines, which is what stupid marketeers and sleazy site owners are trying to subvert. It's obvious that, after a while, these tactics don't just stop working. They eventually turn search engines and legitimate site admins against these idiots.

So, as search engines start discarding domains, spammers go out and register a new one - or, in the most extreme cases, simply register for hosting elsewhere and set up HTTP redirects from one site to the other (see the Kelkoo post below for more). And that means paying for a new domain registration and/or a new hosting fee, which, as things go these days, is pretty damn cheap (we're talking US$50-75 for both if you use smaller, cheaper registrars and bargain hosting services - i.e., pocket money).

DNS registrars and ISPs across the globe are getting domain name registrations in batches of fifty or more, and they have no reason to turn them down, since it's good business in a time of shrinking margins. And it gives rise to lots of "phantom" hosting, with hundreds of virtual sites hosting the exact same redirection script to send traffic elsewhere.

Know Thine Enemy

But I'm fed up with it, and I guess that most people hosting sites with limited resources (which is what this is, since it's hanging off my home ADSL line) feel the same. So I decided to take a look at the other people who foster and profit from this practice: The dishonest software developers.

A case in point: I keep getting fake referrers to a site that sells software for referral spamming (no, it's not hyperlinked. It's easy enough to find if you are stupid enough to think you need something like that to boost your Google rankings).

The software in question is one of the most sleazy and questionable pieces of unadulterated shit (yes, that is an actual four-letter-word I've just typed, but I promise not to follow that route) I've ever seen, in the league of the Visual Basic mass e-mail spamming tools I saw back in the late nineties (and which regularly led to cancellation of accounts in the ISP where I worked).

It looks like this (this is their actual "marketing" screenshot):

Note the tabs: It lets you fake every single aspect of an HTTP transaction, from the User-Agent involved to the set of referrers to spam, including options for proxy use and URL harvesting. And it is marketed like this -

(piece of crap) is a Windows-based mass referrer spammer tool, which means that it will make a connection to a list of URLs with any referrer URL and User-Agent that you specify. This accomplishes several things. Firstly, it generates webmaster traffic from webmasters checking their referral statistics. Secondly, it boosts your link popularity and thereby your Google PR, because a lot of sites have public referral stats with linked entries.

Which is a blatant lie, since Google is all too aware of this. Also, small admins have long since wised up and (like myself) removed all their statistics pages from the search engines.

(snotty software) is extremely fast in operation because instead of actually downloading entire websites, it sends a customized HTTP header for which it receives a small response. On a modern computer with any kind of broadband, the average number of websites hit per second usually stabilize around 60 to 100.

80 websites per second. Or 80 fake referrers per second, which provides ample reach for any idiot who uses it. And it does not use HEAD requests (no, those aren't good enough). What this loathsome thing does is to issue a full GET request and forcefully close the socket (which means that if your server generates pages dynamically, it will burn CPU to generate the whole page and have nowhere to send it to, which may cause a whole new set of problems).

(piece of excrement) has plenty of options and features, among other things an URL harvester which eases your job of collecting URLs to referral market to. The URL harvester just needs you to give it one or more URLs from which it can extract out URLs to the main list. (smelly thing) also sports rotating User-Agents, proxies, and even referrers.

So it's virtually impossible for an ISP to figure out what is happening unless they start looking for very small HTTP transactions coming from a user (which, like all manner of traffic inspection, soon leads to madness). Some IDS solutions do just that, but they're made to keep bad traffic outside, not to police their own users (unless you're a corporate, but that's another story). So, if anyone comes up with the bright idea of stopping this at the ISP level, their next clueless notion will be looking at placing big, fat, expensive (and, by the way, technically unfeasible) boxes at peering points to inspect traffic.

My guess is that spammers will start using HTTPS next. That way madness lies, since the SSL handshake alone will bring your average server to its knees if it starts happening in batches of 80 or so.

(I wish this guy was sued) comes with a pre-generated list of 3047 active blog websites, and a good User-Agent list of real User-Agents taken from real statistics is also included.

So, time to get out of that list, then. I'm fundamentally fed up with both the asinine idiots who think their sites will get more traffic from using this sort of thing and the sleazeballs that make a profit from the whole thing. So I'm tearing down my Referrers page at the first opportunity - i.e., I just did.

Here are a couple of old posts on this topic, so that you can figure out how common this is:

Pattern Recognition (one year ago)
I Hope Kelkoo Gets A Clue (or how to do clueless web marketing)
Trojans and Weblog Spam (which has more links)

Oh, and don't bother e-mailing me about this one. I won't bother replying, and I will be keeping a close eye on my spanking new snort installation, thanks to the Growl notifications I bolted on to it.

Come on, refer me.

Make my day, punk.

Tao of Mac
