Malware Classification and The Lovely Bones

You might have noticed that there are certain issues that press my buttons: the Beeb's botnet, Mac myopia, using Virus Total as a substitute for comparative detection testing. And malware naming, an issue on which I've blogged several times recently.

http://www.eset.com/threat-center/blog/2010/01/09/today-we-have-naming-of-err-malware-1
http://avien.net/blog/?p=121

The estimable Kurt Wismer has taken me to task – well, Tom Kelchner and Mary Landesman too – for approaching the issue from the wrong angle. (See http://anti-virus-rants.blogspot.com/2010/01/whats-in-malware-name.html.)

Well, I guess I agree with him more than I disagree. Kurt says:

"what i have in mind is something not unlike the now defunct common malware enumeration with the exception of using names instead of numbers – a post hoc harmonized second name (a common name or layman's name) for those few pieces of malware that the industry feels they need to communicate to the masses about."

And he's right: we don't really need multiple spellings and synonyms for a common name (the so-called Stormworm, Conficker, whatever). The industry should at least be more scrupulous about cross-referencing names so that our audience knows when we're talking about the same thing by a different name. But there is a difficulty: you can only be sure that we are talking about the same thing at a very generic level. You can't even assume that one company's W32/Nastymalware.A is the same as another company's Troj/Nastymalware.A, because naming derives not only from the code family but also from other factors – notably from the detection algorithm, which may reflect quite generic features such as the infection vector, or the type of botnet component it happens to be.

Kurt's analogy with the naming of bones is interesting and alluring, but I think it's misleading. The human skeleton is an aggregation of more-or-less finite components, and you might even say the same of the rather more complex human genome. What we analyse in computer virus labs is a far more fluid target. In any case, it's rarely critical to the patient (let alone the former patient) to identify the exact bone(s) you managed to break at some point in your life. "I broke my ankle" or "it was a Pott's fracture" is specific enough for most such conversations. Identifying the precise multimalleolar fracture is generally of use and interest to a small and specialized group. Not that I claim that the security industry is a very close analogue to the medical profession – well, in some instances there's a characteristic arrogance seen in both sectors! – but in this instance, there is a similarity between security researchers and medical researchers.

If you go to your doctor with symptoms suggesting an infection (probably a better analogy than a broken bone), he's most likely to prescribe generic measures such as bedrest and anti-pyretics. In the first instance, at least, any infection-specific chemotherapy he offers will probably be broad-spectrum. Only in a minority of cases is he going to initiate testing for exact identification of a strain or substrain. This isn't a perfect analogy to the way the anti-malware industry works either – for a start, we'd have to factor in a whole raft of alternative therapies – but it's closer than dem dry bones. :)

To return to a point made in the 2008 paper by Pierre-Marc and myself, detection is more generic than most people realize. That doesn't mean we can't do exact identification, only that we expend that sort of effort on an individual sample only when we need to. And to go back to one of Tom Kelchner's points, wouldn't you rather we did it that way, as opposed to analysing and classifying each of tens of thousands of daily samples?
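For readers who like to see the distinction in code, here is a toy sketch of generic detection versus exact identification. Everything in it is hypothetical: the byte strings, the `generic_detect` and `exact_identify` helpers and the name Nastymalware.A are invented for illustration, and real scanners rely on far richer techniques than a substring match or a hash lookup.

```python
import hashlib

# Hypothetical data: two invented "variants" sharing one code fragment.
SHARED_FRAGMENT = b"\xde\xad\xbe\xef-unpack-stub"
variant_a = b"header-A" + SHARED_FRAGMENT + b"payload-1"
variant_b = b"header-B" + SHARED_FRAGMENT + b"payload-2"
clean_file = b"harmless file"


def generic_detect(data, pattern=SHARED_FRAGMENT):
    """Flag anything containing the shared fragment (cheap, covers many variants)."""
    return pattern in data


def exact_identify(data, known_hashes):
    """Return a precise name only if this exact sample has been analysed before."""
    return known_hashes.get(hashlib.sha256(data).hexdigest())


# Only variant_a has been through (hypothetical) full analysis and naming.
KNOWN_HASHES = {hashlib.sha256(variant_a).hexdigest(): "Nastymalware.A"}

for label, data in [("a", variant_a), ("b", variant_b), ("clean", clean_file)]:
    print(label, generic_detect(data), exact_identify(data, KNOWN_HASHES))
```

Both variants are caught by the generic check, but only the one that has been individually analysed gets an exact name, which is roughly the trade-off described above.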

The industry's real failure here is less its inability to harmonize than its continuing inability to communicate why harmonization isn't (and shouldn't be) top priority. In the end, it's down to this: harmonization between object and name is easy enough in a one-to-many relationship, but the contemporary threat landscape is largely about many-to-many.
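The many-to-many point can be made concrete with a minimal sketch. The vendors, sample identifiers and detection names below are all made up (borrowing the Nastymalware placeholder used earlier), and the `build_indexes` and `is_many_to_many` helpers are illustrative assumptions, not anyone's actual tooling.

```python
from collections import defaultdict

# Hypothetical detection reports: (vendor, sample_id, detection_name).
REPORTS = [
    ("VendorA", "sample1", "W32/Nastymalware.A"),
    ("VendorA", "sample2", "W32/Nastymalware.A"),
    ("VendorB", "sample1", "Troj/Nastymalware.A"),
    ("VendorB", "sample2", "Troj/Generic.Dropper"),
    ("VendorB", "sample3", "Troj/Nastymalware.A"),
]


def build_indexes(reports):
    """Index the reports in both directions: sample -> names and name -> samples."""
    names_by_sample = defaultdict(set)
    samples_by_name = defaultdict(set)
    for _vendor, sample, name in reports:
        names_by_sample[sample].add(name)
        samples_by_name[name].add(sample)
    return names_by_sample, samples_by_name


def is_many_to_many(names_by_sample, samples_by_name):
    """True if some sample carries several names AND some name covers several samples."""
    sample_has_many_names = any(len(n) > 1 for n in names_by_sample.values())
    name_has_many_samples = any(len(s) > 1 for s in samples_by_name.values())
    return sample_has_many_names and name_has_many_samples


names_by_sample, samples_by_name = build_indexes(REPORTS)
print(sorted(names_by_sample["sample2"]))
print(sorted(samples_by_name["Troj/Nastymalware.A"]))
print(is_many_to_many(names_by_sample, samples_by_name))
```

In this invented data, VendorA's W32/Nastymalware.A and VendorB's Troj/Nastymalware.A overlap on sample1 but diverge on sample2 and sample3, so no simple lookup table can map one vendor's name cleanly onto the other's.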

David Harley BA CISSP FBCS CITP
Director of Malware Intelligence

ESET Threatblog (TinyURL with preview enabled): http://preview.tinyurl.com/esetblog
ESET Threatblog notifications on Twitter: http://twitter.com/esetresearch (or @ESETblog)
ESET White Papers Page: http://www.eset.com/download/whitepapers.php

Securing Our eCity community initiative: http://www.securingourecity.org/

Also blogging at:
http://smallbluegreenblog.wordpress.com/
http://avien.net/blog
http://blogs.securiteam.com
http://blog.isc2.org/
http://macviruscom.wordpress.com/

Author David Harley, ESET

Copyright © 2014 ESET, All Rights Reserved.