Microsoft Admits to Live.com Referer Spam

(Sorry for two search engine spammish posts in a row, but all I can do is obey the voices in my fingers. I have no control over them whatsoever, and anything that the court-appointed psychiatrist says to the contrary should be ignored.)

For several months now many websites have been getting referer spammed by Microsoft’s Live Search. You’d get what looked like a regular GET request from an IP address in the 65.55.165.0/24 block, an Internet Explorer user agent, and a referer string that indicated that someone searched for some weird word on Microsoft’s live.com and clicked a link from there to your site. Except these were usually pharmaceutical and porn words, nothing that people actually had on their sites.
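If you want to pull these hits out of your own logs, here's a rough sketch of the check in Python. The only detail taken from what I actually saw is the 65.55.165.0/24 block and the live.com referer; the function name and the idea of passing in an already-parsed IP and referer are my own assumptions, so adapt it to whatever your log format looks like.

```python
import ipaddress
from urllib.parse import urlparse

# The block the fake hits came from. Everything else here is illustrative.
SUSPECT_BLOCK = ipaddress.ip_network("65.55.165.0/24")


def looks_like_live_referer_spam(client_ip: str, referer: str) -> bool:
    """Return True when a hit comes from the suspect block AND claims a live.com referer.

    Hypothetical helper: it assumes you have already split the client IP and
    the referer string out of a log line.
    """
    try:
        addr = ipaddress.ip_address(client_ip)
    except ValueError:
        # Not a parseable IP address, so we can't place it in the block.
        return False

    # hostname is None for empty or junk referers such as "-".
    host = (urlparse(referer).hostname or "").lower()
    from_live = host == "live.com" or host.endswith(".live.com")

    return addr in SUSPECT_BLOCK and from_live
```

Requiring both conditions is deliberate: a real person searching live.com from their own connection shouldn't get thrown out with the fakes.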

First, it was pretty weird. Spammers do this kind of thing all the time, and webmasters are used to it, but that IP block belongs to Microsoft, and the volume was far beyond what spammers usually manage.

Second, it messed up your stats. Any analytics that tried to see what search terms people used to find your site were skewed to hell because of all the fake ones Microsoft was injecting, especially for small sites (like all of mine) where the fake Microsoft referers were overwhelming any real ones. And God forbid any real person actually used live.com to search and find your site, because you’d never know: that one real visit would be buried among all the identical-looking fakes.

Third, it messed up your Adsense statistics, which can affect your Adsense earnings. This apparently wasn’t a regular web bot, this was an actual web browser capable of interpreting Javascript, which meant that when it kept hitting your page, it kept counting as a real user impression as far as Adsense was concerned. But, of course, it never “clicked” on an ad, which meant your clickthrough rate dropped. That’s the kind of thing that can make Adsense think your site isn’t a good site to show ads on. That’s messing with any money you might have made from Adsense.
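To put some numbers on that, here's a tiny sketch of the arithmetic. The figures are invented for illustration (I don't have real ones), and the function names are mine, not anything from Adsense.

```python
def clickthrough_rate(clicks: int, impressions: int) -> float:
    """Clicks divided by impressions; 0.0 when there were no impressions."""
    return clicks / impressions if impressions else 0.0


def diluted_ctr(clicks: int, real_impressions: int, bot_impressions: int) -> float:
    """CTR once bot page loads that never click are counted as impressions."""
    return clickthrough_rate(clicks, real_impressions + bot_impressions)


# Made-up example: 10 clicks on 1,000 real impressions is a 1% CTR.
honest = clickthrough_rate(10, 1000)

# Add 1,000 bot impressions with no clicks and the same site shows 0.5%.
with_bot = diluted_ctr(10, 1000, 1000)
```

Same visitors, same clicks, half the clickthrough rate, just because something that never clicks kept loading the page.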

Then, of course, the conspiracy theories started flying. Why would Microsoft want to go out of their way to interfere with Adsense, a competitor to their own Microsoft adCenter product? Hmmmm….

Well, today Microsoft came clean. Yes, they’ve been faking referers. No, it wasn’t an accident. No, they’re not going to stop (although they say it’ll drop to “almost nothing” soon). They say it’s an attempt to find cloaking, which seems like a poor way to go about it. Set up some obvious fake referers, get all the webmasters in the world to pay close attention to what you’re doing, and use a Microsoft IP block? Any cloaker will just code against those IP addresses now.

Yes, they’ll try to fake referers that are more on topic for your web site. Is it just me, or is that worse? Now it’ll be tough to spot the fake ones, since it’ll look like someone really searched live.com and then found your site. That means that a webmaster looking at their logs to assess the popularity of the different search engines will rank Microsoft higher than it really should be, since it looks like more people are using it than actually are.

They address the Adsense issue, but not directly:

Initially there was a bug in our crawler that caused it to download all content on your page, including ad blocks. We have since fixed this issue by blocking requests to Google and Overture to preserve the integrity of your reporting.

That’s not a fix. That’s a workaround, and only for Google and Overture. What about other ad programs? My reading is that they’ll keep racking up the impression count for them, since they didn’t specifically block the traffic. Does the bot really need to run Javascript? Like I said earlier, this project is completely useless for detecting cloaking at this point, since the IP range is well-known (thanks to the blatant weirdness of the referer spamming in the first place). Since you’re not going to find cloaking, why keep messing with the Javascript at all?

And what’s with the huge number of requests on a single web site? If they really are looking for cloaking, why check hundreds of times on a single low-traffic site that hasn’t shown any signs of cloaking in the past? Why not try once and move on to meatier, spammier targets?

You know, to be honest, I’m surprised people aren’t more pissed off about this. The outrage over Google’s paid link stuff is far greater than the outrage over what Microsoft is/was doing, but I think Microsoft’s actions are much more rage-worthy.

Thoughts? Opinions? Large bundles of unmarked bills to be delivered to me by a skimpily dressed Megan Fox? All are welcome (some more than others.)

