I returned yesterday from Paris, where I attended the iAWACS and EICAR conferences. One of the papers I co-presented at EICAR was on performance testing (as opposed to detection testing). It was written by Ján Vrabec and me, and it's called "Real Performance?" Here's the abstract:

The methodology and categories used in performance testing of anti-malware products, and the measurement of their impact on the computer, remain a contentious area. While there’s plenty of information, some of it actually useful, on detection testing, there is very little on performance testing. Yet, while the issues are different, sound performance testing is at least as challenging, in its own way, as detection testing. Performance testing based on the assumption that ‘one size [or methodology] fits all’, or that reflects an incomplete understanding of the technicalities of performance evaluation, can be as misleading as a badly-implemented detection test. There are now several sources of guidelines on how to test detection, but no authoritative information on how to test performance in the context of anti-malware evaluation. Independent bodies are working on such guidelines right now, but the current absence of standards often results in the publication of inaccurate comparative test results: tests that do not accurately reflect the real needs of the end user, and that dwell on irrelevant indicators, produce potentially skewed product rankings and conclusions. Thus, the “winner” of these tests is not always the best choice for the user. For example, a testing scenario created to evaluate the performance of a consumer product should not be used for benchmarking server products.

There are, of course, examples of questionable results that have been published where the testing body or tester seems to have been unduly influenced by the functionality of a particular vendor. However, there is also scope, as with other forms of testing, to introduce inadvertent bias into a product performance test. There are several benchmarking tools intended to evaluate hardware performance, but for testing software as complex as antivirus solutions, and their impact on the usability of a system, these simply aren’t precise enough. This is especially likely to cause problems when a single benchmark is used in isolation and looks at aspects of performance that may give an unfair advantage or disadvantage to specific products.

This paper aims to objectively evaluate the most common performance testing models used in anti-malware testing, such as scanning speed, memory consumption and boot speed, and to highlight the main potential pitfalls of these testing procedures. We present recommendations on how to test objectively and how to spot potential bias. In addition, we propose some “best-fit” testing scenarios for determining the most suitable anti-malware product according to the specific type of end user and target audience.
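To make one of those pitfalls concrete: scanning-speed tests based on a single run can be badly skewed by file caching and background activity. The sketch below is purely illustrative and is not from the paper; the scanner binary, flags and corpus path (example_av_scanner, /testsets/clean_corpus) are hypothetical placeholders, and a real test would control for far more variables. It simply shows why separating the first ("cold") run from subsequent ("warm") runs, and repeating measurements, matters.

```python
import statistics
import subprocess
import time

# Hypothetical command line for an on-demand scan of a fixed file set;
# the actual binary, flags and test corpus will vary by product.
SCAN_COMMAND = ["example_av_scanner", "--scan", "/testsets/clean_corpus"]
RUNS = 5  # a single run is a common pitfall: results vary with caching and background activity


def timed_scan(command):
    """Run one on-demand scan and return the elapsed wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(command, check=True, capture_output=True)
    return time.perf_counter() - start


def benchmark(command, runs=RUNS):
    """Repeat the scan, reporting the first ('cold') run separately from the rest,
    since caching often makes later runs much faster."""
    timings = [timed_scan(command) for _ in range(runs)]
    return {
        "cold_run_s": timings[0],
        "warm_median_s": statistics.median(timings[1:]),
        "warm_stdev_s": statistics.stdev(timings[1:]) if len(timings) > 2 else 0.0,
    }


if __name__ == "__main__":
    print(benchmark(SCAN_COMMAND))
```

Even a toy harness like this highlights the reporting question the paper raises: quoting only the warm median, or only the cold run, can favour quite different products.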

We're not able to re-publish the paper on a commercial site, but you can download the full paper from here.

David Harley CISSP FBCS CITP
Research Fellow & Director of Malware Intelligence

ESET Threatblog (TinyURL with preview enabled): http://preview.tinyurl.com/esetblog
ESET Threatblog notifications on Twitter:
http://twitter.com/esetresearch; http://twitter.com/ESETblog
ESET White Papers Page: http://www.eset.com/download/whitepapers.php

Securing Our eCity community initiative: http://www.securingourecity.org/

Also blogging at:
http://amtso.wordpress.com/
http://avien.net/blog
http://blogs.securiteam.com
http://blog.isc2.org/
http://macvirus.com/
http://chainmailcheck.wordpress.com
http://smallbluegreenblog.wordpress.com/