Way back on August 17th, the first reports about MSN sending weird referrer spam started trickling in. The spam shows up as referral strings containing adult keywords, pharmaceutical keywords, etc. A few weeks later, msndude made a statement about this, saying the hits were official Microsoft quality check tests. Specifically, he said, "The traffic you are seeing is part of a quality check we run on selected pages. While we work on addressing your concerns, we would request that you do not actively block the IP addresses used by this quality check; blocking these IP addresses could prevent your site from being included in the Live Search index."
For various reasons, msndude's answer is completely unsatisfactory, and the fact that this is still happening months later without any further clarification from msndude is even worse. I've had some people bugging me to look into this for a while now, but I assumed it would just disappear and not be worth the time. Now, I have to wonder if that was a mistake on my part.
Something fishy appears to be going on, and the one comment from msndude about it doesn't get rid of the fishy smell. I think a little more noise will need to be made if we hope to get some real answers about this.
Here are a few points from previous forum threads and blog posts that you might want to think about:
I am getting thousands of hits where the items in my log show the referrer as follows;
When I load the referred page then I am told that there are no results. Also there is no relationship between the keyword and the page requested. The keywords are single words and seem to be mainly concerned with the normal spam areas.
- Landing pages appear to be pseudo-random (seemed to request 'batches' of related pages)
- IPs changeable in the range 65.55.165.*
- Spoofed referrers are search.live.com/result.aspx?q=[KEYWORD]&mrt=en-us&FORM=LVSP. However, the bot also makes requests without a referrer, sometimes in the course of the same visit
- Keywords are usually single words, varying from obvious commercial (but inoffensive) to drug names and pr0n-related words
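If you want to see how much of your own traffic fits the pattern described in the points above, something like the following rough Python sketch could pull those hits out of a log. It is only an illustration under a few assumptions of mine: that your server writes the Apache "combined" log format, that the reported 65.55.165.* range can be treated as a /24, and that the referrer looks exactly like the one quoted above. The names `SUSPECT_NET`, `LOG_LINE`, and `classify` are made up for this example.

```python
import re
from ipaddress import ip_address, ip_network
from urllib.parse import urlparse, parse_qs

# Assumption: the 65.55.165.* range reported in the thread, written as a /24.
SUSPECT_NET = ip_network("65.55.165.0/24")

# Assumption: Apache/NCSA "combined" log format.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] "(?P<request>[^"]*)" '
    r'\d{3} \S+ "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def classify(line):
    """Return 'spoofed-referrer', 'no-referrer', or None for one log line."""
    match = LOG_LINE.match(line)
    if match is None:
        return None  # not a line in the assumed format
    try:
        if ip_address(match.group("ip")) not in SUSPECT_NET:
            return None  # request came from outside the reported range
    except ValueError:
        return None  # the log holds a hostname, not an IP address

    referrer = match.group("referrer")
    if referrer in ("", "-"):
        # The bot reportedly also makes requests with no referrer at all.
        return "no-referrer"

    parsed = urlparse(referrer)
    query = parse_qs(parsed.query)
    if (
        parsed.netloc == "search.live.com"
        and parsed.path == "/result.aspx"
        and query.get("FORM") == ["LVSP"]
        and "q" in query
    ):
        return "spoofed-referrer"
    return None
```

Run each line of your access log through `classify` and count the results. This only tags the hits so you can set them aside when reading your stats; it does not block anything, and whether to block is a separate decision given msndude's warning.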
To add to it, they started hammering my poetry site on Thursday, and they were also downloading my AdSense blocks, completely inflating my stats. MyBlogLog, however, didn't recognize them as visitors, so the discrepancy was immediately obvious.
For several weeks I have been annoyed by this "quality check" as it passes itself off as real behaviour with full-fledged browser capabilities and a standard user agent. All of a sudden my customers are all excited over getting all this search engine traffic from Live Search, which they are in fact NOT.
Some people on other forums think that MSN is sending them traffic.
That's one way to make it look like someone is using their SE.
I am getting severely hacked off with this. My logs are the most important tool I've got and they are being wrecked.
My opinion is simple, as I've stated here in WebmasterWorld for two months also, that Microsoft is intentionally screwing up our log files for their own gain.
It takes the .css and .js stuff despite it being off limits in robots.txt.
All of the above quotes come from the WMW thread mentioned in the first sentence of this post. The thread is fairly long, so although I don't usually "rip off" this much content, it seemed the only way to really show the meat of the problem without making everyone read the entire thread (which of course, 99% wouldn't bother doing).
Surely, there's something really bizarre about all of this, and I think everyone deserves a better answer than msndude initially provided. In addition, it needs to STOP happening. Microsoft, just stop it. It's wrong. Period.