Wow. I must be seriously missing something, because the story about Bing supposedly stealing results from Google just doesn’t make much sense.
I’ve read the entry. I was also quite tempted to support Google’s “they’re cheating” outcry, but then I stopped and re-read the story more carefully.
So, Google engineers tested the assumption that their search results somehow leak into Bing. To test it, they worked with the long tail: they added a made-up word whose result showed exactly one irrelevant page, one that couldn’t possibly be associated with that word through any normal means.
Then they gave laptops with Windows, IE8 and the Bing toolbar to engineers, who googled that funky word and clicked on the result. In a couple of weeks Bing.com started to respond with exactly the same page for that same made-up word. Voila, Bing must be stealing Google’s data!
Except there’s one very simple explanation. The Bing toolbar reports your “interactions with search engines” to Bing, including the click-stream. And since the test used a word that matched absolutely nothing on the whole internet (until today, when search results for “hiybbprqag” are probably exploding, thanks to all the blog posts), Bing had exactly one relevant source for associating the link with the magic word: a number of clicks from Google engineers, who look like regular users.
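To make that mechanism concrete, here is a minimal sketch in Python of how click-stream data alone could tie a query to a URL. Everything in it (the `ClickEvent` record, the `build_click_index` and `best_guess` functions, the example.com URL) is made up for illustration. It is not Bing’s actual pipeline, just the simplest aggregation I can think of that would produce the observed behavior.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ClickEvent:
    """One hypothetical toolbar report: the user searched `query` and clicked `url`."""
    query: str
    url: str


def build_click_index(events):
    """Count, for every query, how often each URL was clicked."""
    index = defaultdict(Counter)
    for event in events:
        index[event.query.lower()][event.url] += 1
    return index


def best_guess(index, query):
    """Return the most-clicked URL for a query, or None if nobody ever clicked."""
    clicks = index.get(query.lower())
    if not clicks:
        return None
    return clicks.most_common(1)[0][0]


# Assumed data: twenty testers search the made-up word and click the planted page.
events = [ClickEvent("hiybbprqag", "http://example.com/planted-page")] * 20
index = build_click_index(events)

# For a word with no other signal anywhere, the clicks are the whole story.
print(best_guess(index, "hiybbprqag"))
# A made-up word that nobody clicked on yields nothing at all.
print(best_guess(index, "another-made-up-word"))
```

With a normal query, twenty clicks would drown among millions of other signals. With a word that exists nowhere else, they are the only signal there is.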
Of course, the privacy statement for Bing Rewards, for example, explicitly says that search engine use is tracked:
In order to reward you for your participation in the Bing Rewards Preview, you need to download the Bing Bar which contains the Reward Counter. The Reward Counter collects information about your interaction with Bing and different search engines including the number of web searches you do each day, the types of searches you complete (such as for news or images), and the number of search ads you click on.
And I’m sure there is other, vaguer wording (thank you, legalese-speaking lawyers) about how all your online wandering is used to improve Microsoft’s services, including that very same Bing search.
So, a carefully crafted, clearly engineered, long-tail result gets associated with a fake page thanks to click-stream data. Does that qualify as “stealing results”? Hardly, because the source is user interaction. Did the engineers capture all of the network traffic? Can they say with 100% certainty “No data was transmitted between our laptop and any of Microsoft’s services that contained anything remotely seeming to link our magic word and a URL”? I haven’t seen it in their blog post. If they did account for that aspect of data collection, then why didn’t they say so?
I guess another possible reason for Google-Bing-Gate is that someone overreacted. “Bing results are too similar to ours”, that someone thought, and the abovementioned experiment “confirmed” their suspicion (though frankly I think the only real confirmation would be if the “magical” results appeared in Bing without any interaction by humans), and thus the blog post was born.
Maybe the mere thought of using click-stream data to “add” to results was heresy in the mind of any Googler (“We’ll be google-bombed into oblivion!”, except here there are no results to be bombed, remember?). Maybe the experiment was their attempt to prove that Google is the only engine that could possibly give relevant results?
I agree, Google is quite relevant in most cases, and I use it daily. But simple logic dictates that if other engines improve over time, all results will converge on some abstract “best”, regardless of how the engines learn to get to that point. And if someone is watching the click-stream, then the fact that users clicked on something on Google’s search results page would kinda imply “using Google’s results”. Along with everything else.
So, the whole story is a bit too hysterical, a bit too stretched, and a bit nonsensical. I hope there will be a clarification that absolutely excludes the click-stream/“user experience improvement program” as the hint to Bing.com that blah-blah-unique-word should go to some specific URL. Pretty please? Also, it would be interesting if they had actually created two results and had everyone click on only one. Would that be the one that appeared in Bing’s reply? Or both? Alas, we’ll never know, because even if this experiment were conducted now, the outcome would be written off as them “changing logic after the accusation went public”. *sigh*
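For what it’s worth, here is what a click-only model would predict for that two-result experiment, as a small Python sketch. The `rank_by_clicks` function and the two example.com pages are hypothetical, and the model is deliberately naive: it knows nothing except which URLs users clicked.

```python
from collections import Counter


def rank_by_clicks(clicked_urls):
    """Order URLs by click count; a URL nobody clicked never shows up at all."""
    return [url for url, _ in Counter(clicked_urls).most_common()]


# Hypothetical setup: two planted pages are shown for the made-up word,
# but every tester clicks only the first one.
planted = ["http://example.com/page-a", "http://example.com/page-b"]
observed_clicks = [planted[0]] * 20

ranking = rank_by_clicks(observed_clicks)
print(ranking)
print("page-b known to a click-only model:", planted[1] in ranking)
```

If only the clicked page ever surfaced, that would point to the click-stream. If the unclicked page surfaced too, the click-stream alone could not explain it.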