By David Harley
As previously mentioned here, I presented my 16th Virus Bulletin paper at the Virus Bulletin 2017
conference in Madrid on Thursday 5th October. The paper The (testing) world turned upside down is now available here by kind permission of Virus Bulletin. [1]
The paper doesn’t address everything I ever wanted
to say about product testing, but it does touch on quite a few of the topics
that continue to raise my blood pressure, so my intention is to use it as a
springboard from which to talk about some of those issues in a series of blog
articles. In the first of those articles, I’m going to talk about malware
simulation. Simulators are not always used specifically in product testing, and
they’re not usually (one hopes) deliberately used to mislead, but the harsh truth is that people tend to expect more from a simulation than it can deliver. [2]
How to bias testing
Here’s a (pseudonymous) quote from a 1993 vintage article in Virus News International [3]
on how to bias testing. It’s amazing, given the age of that article, how many
of the techniques it describes (or their offspring) can still be seen in
(mis)use in 2017:
“Don’t use viruses at all. Use simulated viruses.
Assume that the simulation is perfect and that therefore all products should
detect them.”
Subsequently, a well-known open letter [4] from the year 2000 objected to
simulated viruses, with particular reference to the Rosenthal Utilities,
a suite of programs considered by the security industry to be of no practical
use and in some circumstances potentially dangerous. The signatories to
that letter (and yes, I was one of them) noted that:
‘…using simulated viruses in a product review inverts
the test results … because:
· It rewards the product that incorrectly reports a non-virus as infected.
· It penalizes a product that correctly recognizes the non-virus as not infected.’
(Bizarrely, one of the signatories of that letter
still represents a testing organization that has recently (reportedly) conducted a sponsored test that used ‘created’ or
‘simulated’ malware, as well as disabling layers of functionality. [5])
Around the time I first started to work closely
with ESET, we saw yet more simulated viruses, courtesy of the ‘Untangled’
so-called Anti-Virus Fight Club test, [6] standing as representative of
many other tests and reviews where the tester used:
…simulated viruses, virus fragments, or even
“virus-like” programs, whatever that means.
EICAR: tests for installation, not detection
Strangely, the Untangled test used several
instances of the EICAR test file, which is not exactly a simulation and is certainly not suitable for comparative testing [7] unless the
test objective is along the lines of ascertaining ‘does this product detect the
EICAR test file?’
True, most malware isn’t viral any more, but I
don’t think that matters in this context. Nor, of course, is the EICAR file a virus, and it doesn’t in any respect behave like one. It isn’t really a simulation so much as an installation check, rather like the AMTSO security
features check utilities (many of which use the EICAR file).
The EICAR file and the AMTSO (Anti-Malware Testing
Standards Organization) feature settings checks ‘work’ because much of the
industry agrees to recognize them, not as a simulation but as an indication
that a security product is installed and ‘awake’, though it’s not a guarantee
that it’s fully functional.
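To make the ‘installation check’ idea concrete, here’s a minimal sketch in Python of what such a check amounts to: it writes the standard 68-byte EICAR test string to disk and watches what happens. The file name, the wait time and the interpretation of each outcome are my own assumptions, and behaviour will vary from product to product.

import os
import time

# The 68-byte EICAR test string, split so that this script's own source
# is less likely to be flagged by on-access scanning.
EICAR = (
    r"X5O!P%@AP[4\PZX54(P^)7CC)7}$"
    + "EICAR-STANDARD-ANTIVIRUS-TEST-FILE!$H+H*"
)

path = "eicar_check.com"  # hypothetical file name, purely illustrative
try:
    with open(path, "w") as f:
        f.write(EICAR)
except PermissionError:
    print("Write blocked: on-access protection appears to be active.")
else:
    time.sleep(5)  # give a resident scanner a moment to react
    if os.path.exists(path):
        print("File still present: check the product's settings or logs.")
    else:
        print("File removed or quarantined: the product is installed and 'awake'.")

Again, all this tells you is that something is watching the filesystem; it says nothing at all about how well the product detects real malware.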
Perhaps it’s a little unfortunate that AMTSO chose
to include installation tests on its web site, in that – because the
organization is so much associated with ‘proper’ comparative and certification
testing – people might think that AMTSO’s offering of installation tests in
some way legitimizes their use in comparative tests. However, some people and
organizations certainly seem to find installation tests useful, and AMTSO has
made its views quite clear in a guidelines document on the Use and Misuse of Test Files.
Simulations and FPs
There’s been an avalanche lately of malware
simulators (especially ransomware simulators) that do not differ much from
those installation checks and old-school virus simulators. Some virus
simulators relied on embedding non-functional snippets of viral code, expecting
security software to be triggered by recognition of a simple scan string,
rather than by more thorough and complete identification mechanisms. However,
this approach was based on a basic misunderstanding of how even stone-age signature detection worked. There was never a single ‘universal’
static signature for each malicious sample, used by all scanners to identify
it.
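As a purely illustrative sketch (the ‘sample’, the ‘engines’ and their signatures below are all invented), consider how two scanners relying on nothing but static scan strings might still choose entirely different byte sequences from the same sample, so a simulator that embeds one arbitrary fragment may trigger one engine and not the other:

# Invented example: two hypothetical engines derive different static
# signatures from the same (fake) sample, so there is no single
# 'universal' scan string that a simulator could embed.
FAKE_SAMPLE = bytes(range(256))          # stand-in for a malicious binary

SIG_ENGINE_A = FAKE_SAMPLE[16:32]        # vendor A picks one byte sequence...
SIG_ENGINE_B = FAKE_SAMPLE[200:216]      # ...vendor B picks quite another

def detects(signature: bytes, target: bytes) -> bool:
    # Naive scan-string match: flag the target if the signature appears in it.
    return signature in target

# A 'simulator' that embeds a single arbitrary fragment of the sample.
simulator = b"HARMLESS HOST PROGRAM" + FAKE_SAMPLE[10:40]

print(detects(SIG_ENGINE_A, simulator))  # True: the fragment happens to cover A's string
print(detects(SIG_ENGINE_B, simulator))  # False: B's string isn't in the fragment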
Nowadays, sole (or even frequent) reliance on
static signatures in mainstream security products is, in any case, ancient
history, or a figment of the imaginations of so-called next-gen marketroids.
Here’s an extract from a paper that we (Lysa Myers, Eddy Willems and I) presented at
AVAR several years ago, describing Doren Rosenthal’s notorious (and by that
time long-discredited) virus simulators:
“disliked by AV researchers in general, for a
number of reasons:
· They were based on the false premise that a product that detects a real virus should also detect a simulation of that virus.
· The previous false premise is also based on the equally false premise that a virus signature is some unique footprint and that all security software uses the same signature. There is, of course, no reason why random virus fragments should be detected as if they were a real virus, and even less reason for several vendors to use the same signature.”
To be fair, the registered version of the Rosenthal
utilities did, in fact, include “a real if relatively innocuous virus.”
This could therefore be described as a rare example
of a simulation displaying something close to viral behaviour. However, the
inappropriate use of Rosenthal’s programs by some testers in comparative
testing “forced the AV industry to add detection of yet another non-virus to
the impressive selection of garbage files, intendeds and other inappropriate
objects it needs to detect if a product isn’t to be penalized for not detecting
what it shouldn’t need to detect.”
Present-day ransomware simulators tend to take a
more generic approach: rather than impersonating specific malware, they may attempt
to simulate malicious behaviour. However, they really don’t behave much like
real ransomware. In the Good Old Days, it was at least arguable that ‘viral’
behaviour (i.e. self-replication) was a usually reliable indicator of malicious behaviour, which tended to make heuristic detection of ‘real’ viruses
comparatively easy. However, ransomware simulators usually focus
(understandably) on file encryption, which may or may not be malicious. After
all, it’s also a primary function of some security software! Ransim
illustrates exactly this difficulty: in order to behave ethically, the
simulator is designed
to introduce its own files and then encrypt them.
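For illustration only, here’s roughly what that ‘encrypt only your own files’ approach looks like in practice. This is my own toy sketch, not Ransim’s code: the file names and the trivial XOR ‘cipher’ are invented, and real simulators are considerably more elaborate.

import os
import tempfile

KEY = 0x5A  # toy key; real ransomware uses strong cryptography

def toy_encrypt(data: bytes) -> bytes:
    # Trivial XOR 'encryption', just enough to show the pattern.
    return bytes(b ^ KEY for b in data)

workdir = tempfile.mkdtemp(prefix="sim_")
created = []

# Step 1: the simulator creates its own decoy files...
for i in range(3):
    path = os.path.join(workdir, f"decoy_{i}.txt")
    with open(path, "wb") as f:
        f.write(b"simulator-owned content, not user data")
    created.append(path)

# Step 2: ...and encrypts only the files it has just created,
# never touching pre-existing data on the machine.
for path in created:
    with open(path, "rb") as f:
        data = f.read()
    with open(path, "wb") as f:
        f.write(toy_encrypt(data))

print(f"'Encrypted' {len(created)} self-created files in {workdir}")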
But avoiding encrypting ‘existing files on disk’ is
pretty much the opposite of what real ransomware does. There’s no reason why a
process that only encrypts files it has just created itself should be diagnosed
as malicious, let alone as ransomware. Encryption is a legitimate function: it’s
only the context that makes it malicious. Otherwise, Microsoft Word’s
operations when a user chooses to password-protect an existing document should
also be detected as ransomware, which is patently ridiculous. However,
encryption covertly performed by ransomware on files that are accessed and
modified without the authorization of their owner clearly is malicious.
So how does a simulator do what real ransomware
does without being real malware? I am not convinced that it’s even realistically possible to do so ethically and in absolute safety, but it can’t be done usefully
by blocking ‘operations’ without taking into account what suspected malware is
really doing. Modern security software is not generally concerned with
simplistic signature scanning, but with techniques that analyse how software
behaves in order to establish whether it has some possible or probable
malicious intent. It’s the ability to distinguish between the legitimate and
malicious execution of an operation that makes a competent security program
effective. While it’s too much to hope that security software will always get
it right, it seems to me that there are only two ways that it can always
detect simulated malware.
1. By flagging as suspicious every operation that might just be malicious, irrespective of the inevitable false positives, and, even worse, by then letting the user decide whether to block or accept execution.
2. By detecting the simulator rather than the malicious process it’s intended to emulate. In general, those who generate simulators cry foul when this is done, and it does, of course, invalidate the purpose of the simulation. But that’s a problem that arises from the dubious nature of the simulation approach, not from the incompetence or malfeasance of security vendors.
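To make that distinction between legitimate and malicious execution a little more concrete, here’s a deliberately simplified sketch of the kind of context a behaviour monitor might weigh before objecting to an ‘encrypt file’ operation. It’s a toy model of the general idea, not a description of how any particular product works, and the class and function names are mine.

from dataclasses import dataclass, field

@dataclass
class ProcessContext:
    name: str
    files_created: set = field(default_factory=set)

def assess_encrypt(ctx: ProcessContext, path: str, user_consented: bool) -> str:
    # Verdict for an observed 'encrypt file' operation, based on context.
    if path in ctx.files_created:
        return "allow"        # the process is only touching files it created itself
    if user_consented:
        return "allow"        # e.g. a user password-protecting a document in Word
    return "suspicious"       # covert encryption of somebody else's data

# A simulator (or an installer, or a backup tool) encrypting its own files:
sim = ProcessContext("ransomware_simulator", files_created={"/tmp/sim/decoy_0.txt"})
print(assess_encrypt(sim, "/tmp/sim/decoy_0.txt", user_consented=False))   # allow

# Real ransomware covertly rewriting a user's document:
rw = ProcessContext("real_ransomware")
print(assess_encrypt(rw, "/home/user/thesis.docx", user_consented=False))  # suspicious

A simulator that is ‘allowed’ under this kind of logic hasn’t exposed a detection failure; it has simply demonstrated that the product declines to flag harmless behaviour.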
It’s not just that malware simulators generally
fail to emulate real malware successfully. The security industry hasn’t agreed
to recognize them as having some sort of EICAR-like functionality, either. If
they were to be flagged by mainstream anti-malware, it probably ought to
be as something analogous to a PUA (Potentially Unwanted Application).
Something like ‘Quasi-Simulator Non-Malware’. Which is a polite way of saying
‘brain-dead intentional false positive’.
Pussyfooting
This might be the right time for me to introduce
you to my cat.
We call him Ike. Short for EICAR.COM. He is a
simulated tiger, but that doesn’t make him suitable for tiger detection
testing. (We also have a dog called Little Bobby Tables.)
Sims and Sinners
A simulation of an attack is, by definition, not a
real attack, even if it’s a ‘good’ simulation (which would be a novelty). It is
someone’s conception of what an attack should look like, but is not truly
malicious. In general, the security industry avoids detecting simulated
attacks, as not only are they technically false positives, but they’re also
founded on dubious assumptions about how malware and anti-malware programs
work. Simulations are usually based on a snapshot view of some malware, and
disregard differences in malware and anti-malware technologies. They almost
invariably mislead people who expect them to be accurate. In fact, they prove
little or nothing about what a product detects except that it detects a
specific simulation.
Companies choosing to detect simulators are like
those companies who used to detect garbage files, simulations, etc. known to be
present in commonly-used sample collections, to avoid being penalized in poor
but influential tests. In other words, they avoid competitive disadvantage, but
legitimize poor tests. In general, products that flag a simulation are merely detecting the presence of the simulator itself, even if they also detect the generic
type of breach the simulator is intended to represent or impersonate.
Sadly, simulations are not good indicators of how
good a product is at detecting real malware, still less of how it compares in
performance to other products.
References
[1] Harley, D. The (Testing) World Turned Upside
Down. Virus Bulletin International Conference 2017. https://geekpeninsula.wordpress.com/2017/10/09/virus-bulletin-conference-paper-2017/
[2] Gordon, S. Are Good Virus Simulators Still A Bad Idea? 1996. http://www.sciencedirect.com/science/article/pii/1353485896844047
[3] Virus News International, November 1993,
pp.40-41, 48. http://www.softpanorama.org/Malware/Reprints/virus_reviews.html
[4] Wells, J. et al. Open Letter. 2000: https://www.cybersoft.com/static/downloads/whitepapers/Open_Letter.pdf
[5] Ragan, S. Cylance accuses AV-Comparatives and
MRG Effitas of fraud and software piracy. CSO Online. 2017. http://www.csoonline.com/article/3167236/security/cylance-accuses-avcomparatives-and-mrg-effitas-of-fraud-and-softwarepiracy.html
[6] Harley, D. Untangling the Wheat from the Chaff
in Comparative Anti-Virus Reviews. Small Blue-Green World. 2007. https://geekpeninsula.files.wordpress.com/2013/09/av_comparative_guide.pdf
[7] Harley, D.; Myers, L.; Willems, E. Test Files
and Product Evaluation: the Case for and against Malware Simulation. AVAR
Conference 2010. http://www.welivesecurity.com/media_files/white-papers/AVAR-EICAR-2010.pdf