Larry Seltzer posted an interesting item yesterday. The article on "SW Tests Show Problems With AV Detections " is based on an "Analyst's Diary" entry called "On the way to better testing."
Kaspersky did something rather interesting, though a little suspect. They created 20 perfectly innocent executable files, then created fake detections for ten of them. Then they uploaded (actually, the blog says re-uploaded) the files to Virus Total. They claim that after ten days "up to" 14 other vendors also had detection for the files for which Kaspersky had a fake detection.
(No, ESET wasn't one of them! We don't have a marketing axe to grind here.)
In fact, several vendors only detected one or two of those files, not the whole set. Furthermore, in at least one case all the files were detected as "suspicious" rather than malicious, and only by that company's online scanner: in that context, it's possibly defensible, certainly as a short-term flag.
So is there really a problem here? When Virus Total sends a sample of what may be a missed detection to the company whose scanner missed it, they probably don't expect the company simply to add detection without validating the sample: otherwise, a false positive could cascade through the industry in a very short time. [Update: there's an interesting related blog from Hispasec, who provide the Virus Total service, at http://translate.google.com/translate?u=http%3A%2F%2Fwww.hispasec.com%2Funaaldia%2F4013&sl=es&tl=en&hl=&ie=UTF-8 - I'm afraid it's an automatic translation, but you get the idea... (Thanks, Pedro, for calling my attention to it.)]
But as Larry suggests, there are other factors at play there, rather than a simple mindset of "if Kaspersky detects it, it must be malicious". With all due respect to the company, no-one in this business has escaped problems with false positives, as John Leyden pointed out today in an article in the Register.
In the Kaspersky blog, Magnus suggests that this is really a problem with testing. Well, he has a point. Where tests are carried out with huge sample sets and minimal time and financial resources, it's inevitable that some testers will rely on "source reputation and multi-scanning" rather than manual validation. In fact, Ján Vrabec recently cited problems with a test where a much smaller test set was "validated" according to whether four or more scanners detected it as malicious.
However, AV companies (and the major test organizations) don't, in general, rely on these approaches. Obviously, a significant proportion of analysis for a virus lab has to be manual, though where tens of thousands of samples are received daily, there has to be as much automation as possible. So it may be that there are many factors at work here, such as:
In fact, the problem Kaspersky has flagged may have longer-lasting effects than the company has acknowledged. Larry Seltzer's article seems to say that subsequently, a "hello world" program generated with the same compiler and compiler settings was also flagged by at least two programs. Could this be a compiler-specific heuristic implemented as a result of Kaspersky's artificial false positives? If so, Kaspersky also bear some of the blame.
The sad thing is, that genuine problems may take second place here to the easy story of "who copies from whom." By going straight to the press and presenting journalists with the innocent executables, encouraging them to use Virus Total (inappropriately, in our view) to check their conclusions, Kaspersky has made inevitable what they flagged as a risk. Reproducible experimentation is a worthy target, as long as the experimental methodology is sound, but does that really apply here?
But the real problem here is that Magnus is perpetuating a fallacy that already worries many people in the industry. He suggests that the problem here is with static testing, and that the move to dynamic testing is going to fix things. However, that's neither the problem nor the cure in this case. The problem is non-validation, and the cure is validation. Right now, lots of tests claim to be dynamic, because that's what AMTSO advocates. Quite rightly, in principle: good dynamic testing has the potential to be a more accurate test of a product's efficacy than static testing. But good static testing remains a better approach than bad dynamic testing, which few testers are doing well at the moment. A significant part of that problem is, unfortunately, still validation.
The Research Team
Author David Harley, ESET