Testing, marketing,
and rummaging in the FUD banks
By David Harley
Early in 2017, Kevin Townsend invited me (among
others) to comment on vendor hype (in general, not with reference to any
particular vendor). You can find the article he eventually wrote for SecurityWeek
– including some comments from me – here: Fighting Cyber Security FUD and Hype. He led with a couple of
quotes from Ian Levy, Technical Director of the UK’s National Cyber Security
Centre (NCSC), accusing the security industry of (in Kevin’s words) ‘overhyping
the cybersecurity threat to sell under-achieving products’. I certainly
wouldn’t discourage you from reading the article, in which a wide range of
people at the sharp end of vendor marketing express their opinions.
However, since he didn’t use all my commentary –
it’s a bad habit of mine, providing journalists with more wordage than they
actually need! – and since it’s a topic to which I’ve devoted quite a bit of
thought (and wordage) over the years, I thought I’d expand here on my original
commentary. Since this is a vendor blog, I’ll forgive you if you are skeptical
of my ability to be impartial. However, one reason I’ve worked with ESET for so
long is that the company is scrupulous about keeping researchers at arm’s length
from marketing, so I don’t normally talk about specific products
(ESET’s or anyone else’s). That doesn’t mean that I don’t understand the need
for marketing, though.
A better mousetrap
I’m going to quote myself (actually, myself and former
colleague Randy Abrams), from a paper written for an AVAR conference a few
years ago on – among other things – ethical and professional conduct in
security writing:
What are the cornerstones of success in the
anti-malware industry? Sound technology, of course. Sound marketing, too: a
good product is of little use if no-one knows enough to go out and buy it. There
are other factors too: sound after-sales and support service, for instance,
influences the deployment, maintenance and efficacy of a product or service.
They say that if you invent a better mousetrap,
Amazon users will beat a path to your door. Or something like that. The trouble
is that the security world is packed with products that have their own merits
– well, there are some that have no merit at all, but I’m not about to name
names – but the average consumer isn’t in a position to evaluate those comparative
merits and demerits. Nor, to be honest, are some reviewers and testers, but
some have made a highly commendable effort to raise the standard of testing
(dramatically, in some cases). I do believe that AMTSO deserve credit for
facilitating that rise in standards. (I’ll come back to testing shortly.)
Still, there’s acceptable, ethical marketing, and
then there’s hype.
Kevin Townsend suggested (and I don’t disagree)
that there are two main varieties of vendor hype:
1. We are the best thing since sliced bread and will solve all of your problems
2. Our competitors are totally useless
The 100% mousetrap
If you’re familiar with my writing, you won’t be
surprised to hear that I regard marketing based on the promise of 100%
protection in perpetuity as not only ethically suspect (and usually
technically indefensible) but actively dangerous. It’s an attractive idea, the
hope that once you buy a given product, all your security problems will be
over. But it’s not the world we live in.
Some marketing – well, a lot of marketing –
in the security sector takes the position that all you have to do is buy X, and
then you never have to take any of the responsibility for your own security.
Such claims are not necessarily based on deliberate deception. Often they are
based on a lack of technical knowledge on the part of people with a marketing
agenda. And often they are not: in combination with the ‘slag off the
competition’ approach so beloved of startups in the security sector, half-truths and
misconceptions are often initiated and spread by people who clearly know
enough to know better.
Emotive language
To address another point Kevin raised, there’s no
doubt that emotive language relating to warfare and/or epidemiology has long
been a staple of security-related marketing (and journalism!). I can’t say I
like it – much of my career in security has been devoted to damping down the
fires of hyperbole and advocating less drama and more precision – but I’m most
concerned by the misuse of language in ways that are actually deceptive rather
than just everyday sloppiness.
It annoys me (quite disproportionately) when a
system is described as ‘infected’ when ‘compromised [by some form of trojan]’
would be more accurate, or when people use ‘virus’ as a synonym for ‘malware’.
That may seem just pedantry – well, it is pedantry – but it also
irritates me because the words ‘virus’ and ‘infect’ can be used to set
up the reader to expect the cyber-equivalent of Ebola, when much malware is
closer to the common cold. I also regard it as frankly deceptive when people
blur the distinction between ‘successful attacks’ and ‘attempted attacks’,
because saying ‘ten million systems were attacked’ is more dramatic and makes a
marketing point more effectively than adding that ‘0.001% of attacks were
actually successful’ – which, at that success rate, would amount to just 100
actual compromises …
This doesn’t mean that I underestimate the
possible impact that some attacks, such as ransomware or phishing fraud, can
have on individuals or organizations. After the WannaCryptor (WannaCry) attacks, no-one should be in any doubt
as to the potential ransomware has for harm. And yes, I do think a
good, properly configured security product is a sound investment when it comes
to protecting your data from ransomware, but it’s equally important to maintain
frequent offline backups so that it’s less likely to be disastrous if malware
manages to get past security software.
Testing and evaluation
But let’s talk about testing, since it’s a subject
much on my mind at this moment, because of a conference paper I’ve been working
on for several weeks. (More about that another time …)
While test results are among the tools that
marketing departments use for marketing leverage, they’re not the whole
toolbox. However, test results do provide leverage, and it’s probable that more
people get their test result information filtered through vendor press releases
(directly or indirectly) than get it straight from an independent tester.
Just about any security researcher has seen instances
where selective quoting and paraphrasing of test report content has made it
sound far more favorable to a given product than the test
statistics support. Where a range of products of broadly similar abilities is tested,
testers already face the temptation to base a report on the magnification of minor
differences. On the other hand, we’re seeing an ongoing move away from large
sample sets to far smaller sets, which is – provided the samples are properly
selected, validated and weighted – not a bad thing. (However, we’re also seeing
a resurgence of the bad old idea that testing is so easy that anyone can do it,
with guidance and off-the-peg samples from vendors or their associates…)
Try before you buy
An idea that comes up several times in Kevin’s
article is the necessity of ‘trying before you buy’. Inevitably, since
much of the market consists of products whose detection performance may not,
on average, differ dramatically from that of their competitors, detection
test results are only one important component of marketing strategies:
other performance issues may be just as important, or even more so.
The trouble with ‘try before you buy’ is that you
only get to experience the product in a limited range of situations, and the
less you know about the threat and counterthreat technology, the less you’ll
get out of a trial period. While it’s perfectly reasonable to try to benefit
from the experience of others in forums and through personal discussion, your
indirect experience is filtered through a third party’s experience and
understanding, and that party’s needs may be quite different to yours.
There is a point to evaluation/trial
versions, of course. For instance, an independent tester may, perfectly
reasonably, award stars or a best-of-breed classification to a product that
performs outstandingly in its tests, but you might then find that it’s totally
incompatible with one of your in-house systems or a third-party product that is
vital to your business processes. I don’t think this is a common occurrence,
but it could happen, because no tester can guarantee that a product they test
will work flawlessly in every environment.
When it comes to detection testing, however, the
accuracy of your evaluation is dependent on (among other things) the validity
of your test set. The real value of the test is determined (among other things)
by such complex and interrelated factors as the selection and validation of
samples, weighting/scoring, appropriate configuration and so on. Keeping those
particular juggling clubs in the air requires skill, knowledge, and resources.
If you buy into the model of ‘testing is so easy, you can do it yourself‘, pushed so hard
by certain security companies over the years, you’re less in control of the
methodology than they are.
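To make the weighting and validation point more concrete, here is a minimal, purely hypothetical sketch in Python. The sample data, weights and helper function are invented for illustration and don’t reflect any real test methodology; the point is simply that the headline ‘detection score’ depends as much on which samples are admitted to the set, and how they are weighted, as on what the product actually detects.

# Hypothetical illustration only: sample data, weights and scoring rules are invented.
samples = [
    # (sample_id, validated_as_malicious, prevalence_weight, detected_by_product)
    ("sample-001", True,  1.0, True),
    ("sample-002", True,  0.2, False),  # rarely seen in the wild, so low weight
    ("sample-003", False, 1.0, False),  # not actually malicious: should be excluded
    ("sample-004", True,  0.5, True),
]

def weighted_detection_score(samples, validate=True, weight=True):
    """Return a detection score (percentage) under the chosen rules."""
    scored = total = 0.0
    for _sid, is_malicious, prevalence, detected in samples:
        if validate and not is_malicious:
            continue                     # drop non-malicious 'samples'
        w = prevalence if weight else 1.0
        total += w
        if detected:
            scored += w
    return 100.0 * scored / total if total else 0.0

# Same product, same files, very different headline numbers:
print(weighted_detection_score(samples, validate=True, weight=True))    # ~88.2
print(weighted_detection_score(samples, validate=False, weight=False))  # 50.0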
If you simply rely uncritically on a sample set
from unknown sources (especially if those sources may be a vendor whose product is under test), you’re letting
the supplier of the samples control the test. And if you know that the
source of samples for testing is a vendor whose product you’re evaluating?
Well, I just don’t understand why you would do that.
· The vendor can’t give you samples it doesn’t have.
· The vendor is unlikely to give you samples its products don’t detect.
· The vendor will be tempted to give you samples it knows other vendors won’t detect.
In fact, some professional testers have been known
to solicit samples from vendors and encourage those vendors to offer
samples that other vendors probably don’t have. And I can see some point to
that, though it’s not a point that works to the advantage of the tester’s audience.
It can happen with a reasonably well-weighted and selected set of samples that
a selection of reasonably competent products will score around the same number
of detections. (Can and does happen.) So a tester (or reviewer – not always
the same person) may feel compelled to exaggerate small differences in
detection rates in order to get a more dramatic detection ranking for the
‘editor’s choice’.
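As a rough illustration of why magnifying small differences is risky, the hypothetical sketch below uses a standard normal-approximation confidence interval for a detection rate. The product names and figures are invented: two products detecting 985 and 978 out of the same 1,000 samples have overlapping intervals, so the apparent 0.7-point gap is weak evidence of any real difference, even though it might decide a ranking.

import math

def detection_ci(detected, total, z=1.96):
    """Approximate 95% confidence interval for a detection rate
    (normal approximation to the binomial; illustration only)."""
    p = detected / total
    margin = z * math.sqrt(p * (1 - p) / total)
    return p - margin, p + margin

# Hypothetical results for two products tested on the same 1,000 samples:
for name, detected in [("Product A", 985), ("Product B", 978)]:
    low, high = detection_ci(detected, 1000)
    print(f"{name}: {detected / 10:.1f}% detected (95% CI {low:.1%} to {high:.1%})")

# The intervals overlap (~97.8%-99.3% vs ~96.9%-98.7%), so the ranking
# could easily reverse with a differently chosen sample set.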
But why don’t other companies have the same
samples? Well, here are some possible (hypothetical) reasons:
· Unlike most mainstream vendors, the vendor doesn’t share samples with other companies. Established AV companies have tended to prioritize the good of the community rather than competitive advantage, by sharing samples. Maybe not always all samples, and maybe not always immediately.
· The sample wasn’t around long enough to pose a threat in the real world.
· The sample was generated specifically to offer to testers and potential customers in the hope and expectation that only the donor’s product would detect it. There was a recent case where a high percentage of samples were found not to contain malicious code. While this is claimed to have been a side-effect of a process intended to generate ‘new’ samples of pre-existing malware rather than deliberate deception, it demonstrates how misleading the results can be when samples used for testing and evaluation are not properly validated before use.