Larry Seltzer, one of the better commentators on malware issues, has picked up on the disparity between ESET's naming of the latest variant and Symantec's - they call it W32.Downadup.E. Richard Adhikari (who also seems to pretty clueful) also picked up on the naming issue when we exchanged emails a few days ago.

This issue kind of echoes a question that came up at a presentation I gave recently. "Will vendors ever sort out the naming mess?" The short answer is probably no. Because it's no longer about naming the malware: it's about naming the detection.

There was a time when the model of one name to one variant made sense. In general, one boot sector infected by a given boot sector infector looks much like another.  There was even a convention for naming established by CARO, which many vendors still follow, to an extent. And there was (still is) a publicly available utility (vgrep) that crossmatched names across most of the mainstream vendor detections.

Sadly, though, the present threat landscape defies any attempt to categorize samples more than roughly, if at all.

In general, people want to know whether we detect specific malware because, not unreasonably, they want to be reassured that they're protected against it. But because most malware is tweaked by repacking and reobfuscating at frequent intervals, the sheer glut of samples (anti-malware labs process tens of thousands of unique samples a day) makes that depth of analysis impractical. Fortunately, adding detection for a threat doesn't generally require an exhaustive manual analysis, and ESET puts a lot of Research & Development effort into proactive detections that lessen our reliance on near-exact identification of known samples.

One of the side-effects of this approach is that the name we give to a detection doesn't necessarily tell you much about individual samples. To take a simple example, there is a very wide range of malware that (mis)uses the Autorun faclity as an infection mechanism*, so two malicious programs may be identified using the same detection label even though they have no shared code or any other perceptible relationship at all. Sometimes, the detection label used has more to do with the method of detection than the code family to which it belongs.Pierre-Marc and I presented a paper at Virus Bulletin last year in which we suggested that:

  • Effectively, the "name" of a malicious program is effectively (usually) its hash value.
  • More and more, it's the proactive detections that count, not the exact/near-exact identifications (signatures, loosely speaking).
  • The industry as a whole needs to coax its customers away from this unrealistic model, rather than going along with the near myth that names mean the same thing that they did even ten years ago. 

(Some people might find that paper a bit geekish: I plan on revisiting the theme in the near future, aiming at a less specialist audience. There's also a less geeky article here, but it isn't free..

As I understand it, our Conficker variant names have diverged because we were tracking variants quite assiduously at the beginning, at least until we put together a particularly effective generic signature which is still working quite nicely, thank you, and detecting a range of variants and subvariants. So our Conficker variant names have been divergent from the beginning. It's not that we claim that our classification is any better than anyone else's (though it might be :)), simply that there's no real standard for classification, and it isn't usually practical or useful to try to enforce one retroactively.

* Including Conficker, though the latest versions seem to have dropped that approach, perhaps in the hope of reducing susceptibility to generic detections.