9.6.17


Testing, marketing, and rummaging in the FUD banks

Early in 2017, Kevin Townsend invited me (among others) to comment on vendor hype (in general, not with reference to any particular vendor). You can find the article he eventually wrote for SecurityWeek – including some comments from me – here: Fighting Cyber Security FUD and Hype. He led with a couple of quotes from Ian Levy, Technical Director of the UK’s National Cyber Security Centre (NCSC), accusing the security industry of (in Kevin’s words) ‘overhyping the cybersecurity threat to sell under-achieving products’. I certainly wouldn’t discourage you from reading the article, in which a wide range of people at the sharp end of vendor marketing express their opinions.
However, since he didn’t use all my commentary – it’s a bad habit of mine, providing journalists with more wordage than they actually need! – and since it’s a topic to which I’ve devoted quite a bit of thought (and wordage) over the years, I thought I’d expand here on my original commentary. Since this is a vendor blog, I’ll forgive you if you are skeptical of my ability to be impartial. However, one reason I’ve worked with ESET for so long is that the company is scrupulous about allowing researchers to stay hands-off from marketing, so I don’t normally talk about specific products (ESET’s or anyone else’s). That doesn’t mean that I don’t understand the need for marketing, though.
A better mousetrap
I’m going to quote myself (actually, myself and former colleague Randy Abrams), from a paper written for an AVAR conference a few years ago on – among other things – ethical and professional conduct in security writing:
What are the cornerstones of success in the anti-malware industry? Sound technology, of course. Sound marketing, too: a good product is of little use if no-one knows enough to go out and buy it. There are other factors too: sound after-sales and support service, for instance, influences the deployment, maintenance and efficacy of a product or service.
They say that if you invent a better mousetrap, Amazon users will beat a path to your door. Or something like that. The trouble is that the security world is packed with products that have their own merits – well, there are some that have no merit at all, but I’m not about to name names – but the average consumer isn’t in a position to evaluate those comparative merits and demerits. Nor, to be honest, are some reviewers and testers, though some have made a highly commendable effort to raise the standard of testing (dramatically, in some cases). I do believe that AMTSO deserve credit for facilitating that rise in standards. (I’ll come back to testing shortly.)
Still, there’s acceptable, ethical marketing, and then there’s hype.
Kevin Townsend suggested (and I don’t disagree) that there are two main varieties of vendor hype:
1. We are the best thing since sliced bread and will solve all of your problems
2. Our competitors are totally useless
The 100% mousetrap
If you’re familiar with my writing, you won’t be surprised to hear that I regard marketing based on the promise of 100% protection in perpetuity to be not only ethically suspect (and usually technically indefensible) but actively dangerous. It’s an attractive idea, the hope that once you buy a given product, all your security problems will be over. But it’s not the world we live in.
Some marketing – well, a lot of marketing – in the security sector takes the position that all you have to do is buy X, and then you never have to take any of the responsibility for your own security. Such claims are not necessarily based on deliberate deception. Often, they are based on a lack of technical knowledge on the part of people with a marketing agenda. And often they are not: in combination with the ‘slag off the competition’ approach so beloved of startups in the security sector, half-truths and misconceptions are often initiated and spread by people who are clearly knowledgeable enough to know better.
Emotive language
To address another point Kevin raised, there’s no doubt that emotive language relating to warfare and/or epidemiology has long been a staple of security-related marketing (and journalism!). I can’t say I like it – much of my career in security has been devoted to damping down the fires of hyperbole and advocating less drama and more precision – but I’m most concerned by the misuse of language in ways that are actually deceptive rather than just everyday sloppiness.
It annoys me (quite disproportionately) when a system is described as ‘infected’ when ‘compromised [by some form of trojan]’ would be more accurate, or when people use ‘virus’ as a synonym for ‘malware’. That may seem just pedantry – well, it is pedantry – but it also irritates me because the words ‘virus’ and ‘infect’ can be used to set up the reader to expect the cyber-equivalent of Ebola, where much malware is closer to the common cold. I also regard it as frankly deceptive when people evade the distinction between ‘successful attacks’ and ‘attempted attacks’ because saying ‘ten million systems were attacked’ is more dramatic and makes a marketing point more effectively than adding that ‘0.001% of attacks were actually successful’ …
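To put those two illustrative figures side by side, here is a minimal sketch of the arithmetic. The function name is hypothetical, and the numbers are simply the ones quoted above, not real-world statistics.

```python
def expected_successes(attempted: int, success_rate_percent: float) -> float:
    """Turn a headline 'systems attacked' count into expected successful attacks."""
    return attempted * success_rate_percent / 100.0


# Illustrative figures taken from the paragraph above, not measured data.
attempted = 10_000_000        # 'ten million systems were attacked'
success_rate_percent = 0.001  # '0.001% of attacks were actually successful'

successes = expected_successes(attempted, success_rate_percent)
print(f"{attempted:,} attempted attacks, roughly {successes:,.0f} successful")
```

On those numbers, the dramatic headline figure corresponds to something in the region of a hundred successful attacks, which is a rather different story.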
This doesn’t mean that I underestimate the possible impact that some attacks, such as ransomware or phishing fraud, can have on individuals or organizations. After the WannaCryptor (WannaCry) attacks, no-one should be in any doubt as to the potential ransomware has for harm. And yes, I do think a good, properly configured security product is a sound investment when it comes to protecting your data from ransomware, but it’s equally important to maintain frequent offline backups so that it’s less likely to be disastrous if malware manages to get past security software.
Testing and evaluation
But let’s talk about testing, since it’s a subject much on my mind at this moment, because of a conference paper I’ve been working on for several weeks. (More about that another time …)
While test results are among the tools that marketing departments use for marketing leverage, they’re not the whole toolbox. However, test results do provide leverage, and it’s probable that more people get their test result information filtered through vendor press releases (directly or indirectly) than they do from an independent tester.
Probably any security researcher has seen instances where selective quoting and paraphrasing of test report content has made it sound far more favorable to a given product than is supported by the test statistics. Where a range of products of broadly similar abilities are tested, testers already face the issue of basing a report on the magnification of minor differences. On the other hand, we’re seeing an ongoing move away from large sample sets to far smaller sets, which is – provided the samples are properly selected, validated and weighted – not a bad thing. (However, we’re also seeing a resurgence of the bad old idea that testing is so easy that anyone can do it, with guidance and off-the-peg samples from vendors or their associates…)
Try before you buy
An idea that comes up several times in Kevin’s article is the necessity of ‘trying before you buy’. Much of the market consists of products whose detection performance may not, on average, differ dramatically from that of their competitors, so while detection test results are an important component of marketing strategies, other performance issues may be as important, or even more important.
The trouble with ‘try before you buy’ is that you only get to experience the product in a limited range of situations, and the less you know about the threat and counterthreat technology, the less you’ll get out of a trial period. While it’s perfectly reasonable to try to benefit from the experience of others in forums and through personal discussion, your indirect experience is filtered through a third party’s experience and understanding, and that party’s needs may be quite different to yours.
There is a point to evaluation/trial versions, of course. For instance, an independent tester may, perfectly reasonably, award stars or a best-of-breed classification to a product that performs outstandingly in its tests, but you might then find that it’s totally incompatible with one of your in-house systems or a third-party product that is vital to your business processes. I don’t think this is a common occurrence, but it could happen, because no tester can guarantee that a product they test will work flawlessly in every environment.
When it comes to detection testing, however, the accuracy of your evaluation is dependent on (among other things) the validity of your test set. The real value of the test is determined by such complex and interrelated factors as the selection and validation of samples, weighting/scoring, appropriate configuration and so on. Keeping those particular juggling clubs in the air requires skill, knowledge, and resources. If you buy into the model of ‘testing is so easy, you can do it yourself’, pushed so hard by certain security companies over the years, you’re less in control of the methodology than they are.
If you simply rely uncritically on a sample set from unknown sources (especially if those sources may be a vendor whose product is under test), you’re letting the supplier of the samples control the test. And if you know that the source of samples for testing is a vendor whose product you’re evaluating? Well, I just don’t understand why you would do that.
· The vendor can’t give you samples it doesn’t have.
· The vendor is unlikely to give you samples its products don’t detect.
· The vendor will be tempted to give you samples it knows other vendors won’t detect.
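The effect of those three points on a test can be sketched with a toy model. Everything here is hypothetical: the sample numbering, the two imaginary products A and B, and the assumption that each misses a different 5% of samples. It is an illustration of selection bias, not a description of any real test or vendor.

```python
# Hypothetical world of 1,000 samples, identified by number.
all_samples = set(range(1000))

# Assumption: products A and B each miss 5% of samples, but different ones.
detected_by_a = all_samples - set(range(0, 50))      # A misses samples 0..49
detected_by_b = all_samples - set(range(950, 1000))  # B misses samples 950..999


def detection_rate(detected: set, test_set: set) -> float:
    """Fraction of the test set that a product detects."""
    return len(detected & test_set) / len(test_set)


# An independently assembled test set: here, simply every sample.
independent_set = all_samples

# A set supplied by vendor A: only samples A detects, weighted
# towards the ones A knows that B misses.
vendor_a_set = set(range(950, 1000)) | set(range(500, 550))

for name, test_set in [("independent", independent_set), ("vendor A", vendor_a_set)]:
    rate_a = detection_rate(detected_by_a, test_set)
    rate_b = detection_rate(detected_by_b, test_set)
    print(f"{name:12} A: {rate_a:.0%}  B: {rate_b:.0%}")
```

In this toy model the two products are equally good on the independent set, yet on the vendor-supplied set A looks perfect and B looks dismal, purely because of who chose the samples.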
In fact, some professional testers have been known to solicit samples from vendors and encourage those vendors to offer samples that other vendors probably don’t have. And I can see some point to that, though it’s not a point that works to the advantage of the tester’s audience. It can happen with a reasonably well-weighted and selected set of samples that a selection of reasonably competent products will score around the same number of detections. (Can and does happen.) So a tester (or reviewer – not always the same person) may feel compelled to exaggerate small differences in detection rates in order to get a more dramatic detection ranking for the ‘editor’s choice’.
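One rough way to see why small differences in detection rates deserve caution is to put a confidence interval around each score. The sketch below uses the Wilson score interval, which is my choice for illustration rather than anything prescribed by a particular tester, and the scores (96 and 94 detections out of 100 samples) are hypothetical.

```python
import math


def wilson_interval(detected: int, total: int, z: float = 1.96) -> tuple:
    """Approximate 95% Wilson score interval for a detection rate."""
    p = detected / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return centre - half_width, centre + half_width


# Hypothetical results for two products on the same 100-sample test set.
low_x, high_x = wilson_interval(96, 100)
low_y, high_y = wilson_interval(94, 100)

# If the intervals overlap, the ranking may be no more than noise.
intervals_overlap = low_x <= high_y and low_y <= high_x

print(f"Product X: {low_x:.3f} to {high_x:.3f}")
print(f"Product Y: {low_y:.3f} to {high_y:.3f}")
print("Intervals overlap:", intervals_overlap)
```

With a set this small, the two intervals overlap heavily, so a two-point gap is thin evidence on which to crown an ‘editor’s choice’.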
But why don’t other companies have the same samples? Well, here are some possible (hypothetical) reasons:
· Unlike most mainstream vendors, the vendor doesn’t share samples with other companies. Established AV companies have tended to prioritize the good of the community rather than competitive advantage, by sharing samples. Maybe not always all samples, and maybe not always immediately.
· The sample wasn’t around long enough to pose a threat in the real world.
· The sample was generated specifically to offer to testers and potential customers in the hope and expectation that only the donor’s product would detect it. There was a recent case where a high percentage of samples were found not to contain malicious code. While this is claimed to have been a side-effect of a process intended to generate ‘new’ samples of pre-existing malware rather than deliberate deception, it demonstrates how misleading the results can be when samples used for testing and evaluation are not properly validated before use.

https://www.welivesecurity.com/2017/06/08/testing-marketing-rummaging-fud-banks/?utm_source=feedburner&utm_medium=email&utm_campaign=Feed%3A+eset%2Fblog+%28ESET+Blog%3A+We+Live+Security%29