[Update: Michael St Neitzel also pointed out that there's an issue with installers that verify a checksum before installation. In fact, this is a special case of an issue I may not have made completely clear before: unless this approach is combined with some form of whitelisting, there has to be some way of reversing the modification somewhere in the process for files that turn out to be legitimate. That will provide interesting implementation challenges.]
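To make the checksum point concrete, here is a minimal Python sketch. It is not taken from the patent application: the helper names (insert_marker, remove_marker), the marker bytes, the offset and the sample "installer" bytes are all hypothetical, and SHA-256 simply stands in for whatever checksum an installer might verify. It shows that any inserted string changes the digest, and that the check only passes again if the filter recorded exactly what it inserted and where, so that the change can be reversed.

```python
import hashlib


def insert_marker(data: bytes, offset: int, marker: bytes) -> bytes:
    """Insert marker into data at offset (hypothetical stand-in for the filter)."""
    return data[:offset] + marker + data[offset:]


def remove_marker(data: bytes, offset: int, marker: bytes) -> bytes:
    """Reverse insert_marker, given the recorded offset and marker."""
    if data[offset:offset + len(marker)] != marker:
        raise ValueError("marker not found at the recorded offset")
    return data[:offset] + data[offset + len(marker):]


# Hypothetical stand-in for a legitimate installer and its published checksum.
original = b"MZ" + bytes(range(64))
published_digest = hashlib.sha256(original).hexdigest()

# Assumed marker and offset; a real filter would choose its own.
marker = b"\x00FILTERED\x00"
offset = 2

modified = insert_marker(original, offset, marker)
checksum_ok_after_insert = (
    hashlib.sha256(modified).hexdigest() == published_digest
)

restored = remove_marker(modified, offset, marker)
checksum_ok_after_restore = (
    hashlib.sha256(restored).hexdigest() == published_digest
)

print("checksum matches after insertion:", checksum_ok_after_insert)
print("checksum matches after reversal: ", checksum_ok_after_restore)
```

The awkward part in practice is not the arithmetic but the bookkeeping: something has to keep the offset and marker for every modified file until it is known to be legitimate.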

I was rather scathing recently in a blog for AVIEN (the Anti-Virus Information Exchange Network) about a New Scientist report that described a pending patent from Qinetiq: since that blog was picked up by the Register, perhaps I'll qualify my "sarcastic" comments here.

Notes to New Scientist:

  • You might want to make it clear whether you're talking about viruses or malware. Viruses are a vanishingly small (though still significant) proportion of the malware problem. Trojans of one sort or another give us far more grief, in general, nowadays. I know that many people use the terms malware and viruses interchangeably, but people expect more precision from a science writer.
  • A patch is not the same as an update to a definitions/signatures database to counter a specific threat or threat family.
  • If this industry was really still reliant on the "see a threat, write a signature, distribute a signature" cycle, we'd all be in a lot of trouble right now. Well, maybe we are, but not in that much trouble. Most companies devote a lot of research and development effort to proactive detection. Of course, some are more successful at it than others. ;-)

In fact, checking the actual patent application, the idea dreamed up by Qinetiq isn't as dumb as New Scientist make it sound, and it would be unfair to assess the value of the idea purely on a journalist's interpretation. Clearly Qinetiq has done some thinking around the issue, and some research into prior art. I particularly like the repackaging of the EICAR test file (with a new text string displayed) in "BACKGROUND TO THE INVENTION" [0004]. Let's hope that EICAR (http://www.eicar.org) didn't patent it. ;-)

But it hasn't solved the virus problem (let alone the malware problem). Well, of course, that's not what patents actually do. They summarize an approach to solving a problem: they don't generally provide a map of the implementation of the solution.

Nonetheless, this approach seems less straightforward than the New Scientist report claims.

It isn't really a catch-all generic solution: it relies on the insertion of "strings of arbitrary length within computer files of known type". In other words, while it offers a possible approach to preventing certain known types of threat from executing, it doesn't seem to offer blocking of all potentially executable files. Actually, this is a positive, if it means that code won't be inserted randomly into a file of unknown type, though I'm by no means sure that this is what it means.

  • But if your filter doesn't recognize the format of a file (or even whether it's executable), how can you predict the impact of the introduction of an arbitrary string into an arbitrary location?
  • How can you possibly guarantee that it won't affect the legitimate functionality of a legitimate data file? (Consider, for instance, the surprising range of differences that may be found between PDFs generated by different applications.)
  • Can you reconcile the introduction of termination or looping sequences into code with legislation addressing issues like unauthorized access or modification? The patent application rightly draws attention to the fact that there is often no hard-and-fast distinction between code and data, but what about the equally blurry border between code and systems? If an untrusted file on my system goes off into an infinite loop because of a byte string inserted by your filter, are you sure I authorized your modification to my system?
  • The patent seems largely focused on executable code passed off as data. What about executable code embedded into data files? (Macros, scripts, embedded objects...) Will such code always be recognized and blocked in known data file types? If so, is that a good idea?
  • The patent seems totally focused on either adding or deleting strings. How do you do either without invalidating a digitally signed file? (Thanks to Michael St Neitzel for pointing that issue out - I'm not sure how I managed to overlook it in my original post!)
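To illustrate the second question above (the effect on a legitimate data file), here is a small Python sketch. The container format is a toy invented for this example, not a real file type and not anything described in the patent application: a record count, a table of absolute byte offsets, then length-prefixed records. It is loosely inspired by the way PDF uses a cross-reference table of byte offsets. The function names and the inserted bytes are likewise hypothetical. Inserting just three bytes after the header shifts every record, so the stored offsets no longer point where the reader expects.

```python
import struct


def build_container(records):
    """Build a toy file: u32 count, u32 absolute offsets, length-prefixed records."""
    header_size = 4 + 4 * len(records)
    offsets = []
    body = b""
    for rec in records:
        offsets.append(header_size + len(body))
        body += struct.pack("<I", len(rec)) + rec
    header = struct.pack("<I", len(records))
    table = struct.pack(f"<{len(records)}I", *offsets)
    return header + table + body


def read_record(blob, index):
    """Look up a record via the offset table, trusting the stored offsets."""
    (count,) = struct.unpack_from("<I", blob, 0)
    if not 0 <= index < count:
        raise IndexError(index)
    (offset,) = struct.unpack_from("<I", blob, 4 + 4 * index)
    (length,) = struct.unpack_from("<I", blob, offset)
    return blob[offset + 4:offset + 4 + length]


blob = build_container([b"first", b"second"])
header_size = 4 + 4 * 2

# Hypothetical filter: insert three arbitrary bytes just after the header.
filtered = blob[:header_size] + b"XYZ" + blob[header_size:]

print(read_record(blob, 1))      # the intact file yields the second record
print(read_record(filtered, 1))  # the filtered file yields misaligned bytes
```

A filter that fully understood the format could rewrite the offset table to compensate, but that is exactly the kind of format-specific knowledge that differs between file types and between the applications that generate them.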

So, an interesting idea, but based on a number of assumptions that had already crashed and burned before the end of the last century. One more issue: in the New Scientist article, Ross Anderson was quoted as saying "Now that Qinetiq have patented this idea nobody will use it, even if it works. Patents are seen as damage: people route around them." True. And, as John Leyden remarked in that Register article, "Patents are designed to allow developers to stake out areas of technical innovation. However, in the fiercely competitive anti-virus market, they've more often been used as legal and marketing weapons." Unfortunately, though, misuse of the patenting process in certain cases has resulted in security companies feeling obliged to get their patents in first. I don't know if Qinetiq are hoping to keep what they see as an innovation to themselves, or simply trying to forestall having it taken away from them by someone with a sharper set of lawyers...

David Harley
Director of Malware Intelligence