Popularity and Spurious Statistics

I’ve just been observing a slightly bizarre email thread about the whatdoestheinternetthink.net site, which is apparently aiming to be the place to go if you want a global enquiry tool to find out what the online world thinks about any given subject. You enter a search term, it submits to one or more search engines, and it comes back with a percentage score in three categories: positive, negative and ambivalent. (An actual search comes back with “don’t care” rather than ambivalent, and I don’t think that’s quite the same thing, but let’s not be picky.)

Well, it’s reassuring to note that the search term “ESET” scores 94.3% positive at the moment, whereas Symantec scores 30.2% and McAfee a heartrending 25%. (Sorry Mark, Igor et al! ;-))

However, it seems that we’re all outclassed right now by Microsoft Security Essentials, with a resounding 100% approval. (I figured if I searched just on Microsoft, I’d get a lot of security-unrelated hits that would totally skew the results.) In fact, that last result may be skewed slightly by the fact that it’s apparently based on a single Google hit. So much for the Wisdom of Crowds. :-D

And that makes an interesting point about how to lie with statistics. I’m not much of a statistician, though my father was: his copy of Huff’s book was one of the first serious books I read. But you don’t need to know your mean from your median to realize that:

  • A brand new pre-release product hasn’t had much time to generate negative opinions
  • The bigger a company’s profile, the more comment will be made about it on the Internet (and in the real world, of course)
  • There’s a likelihood that over time, more adverse than positive comments will be made about a specific product, human nature being what it is
  • You can get pretty much any positive result you want, if you’re prepared to spend time tweaking the search terms.
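
To put a rough number on the single-hit problem, here is a minimal sketch in Python. It is not how the site works (we don't know how it works); it simply applies the Wilson score interval, a standard way of asking how far a percentage can be trusted given the number of observations behind it. The function name and the hit counts are hypothetical, and the calculation assumes each hit is an independent opinion, which search results almost certainly are not.

```python
import math


def wilson_interval(positive, total, z=1.96):
    """Approximate 95% Wilson score interval for a proportion.

    `positive` and `total` are hypothetical hit counts; z=1.96 is the
    usual normal quantile for a 95% interval.
    """
    if total <= 0:
        raise ValueError("need at least one hit")
    p = positive / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    # Clamp to [0, 1] to absorb floating-point rounding at the edges.
    return max(0.0, centre - half), min(1.0, centre + half)


# One positive hit out of one: "100% approval" on the face of it.
print(wilson_interval(1, 1))

# Made-up counts giving 94.3% positive from a thousand hits.
print(wilson_interval(943, 1000))
```

With a single hit the interval stretches from about 21% up to 100%, so the headline figure tells us next to nothing; with a thousand (invented) hits it narrows to a few points either side of the score. Even then, a tight interval says nothing about whether the hits were classified sensibly in the first place.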

So even if we knew anything about the classification criteria used by the site’s search algorithm, which we don’t, I wouldn’t advocate that you try to draw any real conclusions about the popularity or value of any vendor or product from this particular instance of lies, damned lies and statistics. Especially in the light of a little experiment carried out by a colleague at ESET UK (thanks, Quinton!): it turns out that people are overwhelmingly in favour of Ebola. Unfortunately, the site doesn’t tell us whether it’s the river, the virus, or the haemorrhagic fever that people are so fond of. Or maybe the fact that there are several musical acts, a cartoon web site and a movie with the same name tells us something. Maybe the algorithm needs a little work, guys. Or maybe some clarification as to what it actually does. Though to be fair, the disclaimer at the bottom does say that the results are provided as-is and are not reliable. :)

Given the mauling that John Lennon received in the 1960s after suggesting that the Beatles were more popular than Jesus, I think I’ll let you find out for yourselves whether a search on http://www.whatdoestheinternetthink.net supports that suggestion. Or for some real fun, try varying the search terms to see how easily you can skew the results either way.

And that’s a real problem: I can actually envisage people generating all sorts of spurious results in the way I did above and using them misleadingly in a PR context, in much the same way that they misuse VirusTotal statistics.

Director of Malware Intelligence

Author David Harley, ESET

  • Great article! I consider it a compliment getting such an elaborate study of my humble fun-project. And: spot on.

    Perhaps the biggest message my website will give you is that the internet is not reliable at all: people should make their own decisions and base their conclusions on their own research.

    However: it *is* a fun frankengoogle-app ;)


  • David Harley

    Thanks for that! Glad you liked the blog. I must admit that I had rather more fun than usual researching it. :)


Copyright © 2017 ESET, All Rights Reserved.