A question that gets raised almost every time that DCIG releases a Buyer’s Guide is, “Why are performance metrics not included in the Guide or considered in its evaluation of the products?” While DCIG has answered this question in various ways and in a number of forums over the last few years, we thought it appropriate to aggregate those randomly posted responses into a more definitive blog entry to address this particular question as it inevitably comes up.
There is no single reason why DCIG does not include performance metrics or attempt to measure them when it puts out its Buyer’s Guides. Rather, there are seven reasons that preclude DCIG from currently measuring or reporting on performance in its Buyer’s Guides and from having any formal plans to do so in the future.
1. Too many products to test in too short a time frame. This is probably the number one reason why DCIG does not test the performance of any products that it includes in its Buyer’s Guides. Most DCIG Buyer’s Guides cover at least 20 products and many cover 50 or more.
Twenty is usually the minimum number of products that DCIG requires to be shipping in a particular space before it even considers doing a Buyer’s Guide on that topic. However, testing even 20 products could take up to five months, as it could take up to a week to properly test each product.
This assumes that the original test plan was agreed upon and does not change once the testing begins, which is a big “IF.” Even then, all of the equipment needs to arrive onsite, arrive on time and be properly configured, which are more big “IFs” when doing testing. Then the person conducting the tests must be free to work uninterrupted for twenty weeks straight, which is almost unheard of in this day and age.
2. Too difficult to construct a performance test that accounts for all of the variables that multiple organizations find relevant. Let’s just assume for a second that DCIG could get twenty products on site and test all of them. Before DCIG would even ask for them to be shipped to its site for testing, it would need to construct a meaningful test plan.
Assuming DCIG is testing storage arrays, should DCIG run mixed workloads against the array? Or should it just run performance tests using an Oracle database? A SQL Server database? All of the above? This does not even take into account whether these applications reside on virtual or physical machines. In short, it could take weeks, months or even years simply to develop a meaningful test plan.
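To give a rough sense of how quickly those questions multiply, here is a small Python sketch that enumerates one possible test matrix. The workload and platform lists are taken from the questions above, while the array names and the function name are hypothetical placeholders, not part of any actual DCIG test plan.

```python
from itertools import product

# Dimensions drawn from the questions above (illustrative, not exhaustive).
workloads = ["mixed", "Oracle database", "SQL Server database"]
platforms = ["physical", "virtual"]


def build_test_matrix(num_arrays, workloads, platforms):
    """Return every (array, workload, platform) combination to be tested."""
    # Placeholder names; a real plan would list actual products.
    arrays = [f"array-{i + 1}" for i in range(num_arrays)]
    return list(product(arrays, workloads, platforms))


matrix = build_test_matrix(20, workloads, platforms)
print(f"{len(matrix)} individual test runs")
```

Even with only three workloads and two platform types, 20 arrays already require 120 separate runs, and every additional variable (block size, read/write mix, cache settings) multiplies that number again.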
3. It is all fun and games until someone loses an eye. Testing always sounds great until someone drives a forklift through the side of a storage array. You may laugh, but I am aware of such an occurrence taking place a number of years ago at one company’s site when they brought in a storage array to test.
Suddenly all talks about testing stopped and everyone had to spend a lot of time figuring out who was liable and how it would get resolved. I never did hear about the final outcome but it illustrates an important point.
Things get dropped, broken and misplaced all of the time. When testing 20+ products from multiple vendors you can just about guarantee something unexpected like this is going to happen. Resolving the problem may take longer than running the tests themselves.
4. Some EULAs prohibit the publishing of performance benchmarks. This one was just brought to my attention a few months ago. While not every vendor has such a statement in its end user licensing agreement (EULA), it is worth noting that some do.
EULAs like this alone would preclude DCIG from testing and evaluating some products in its Buyer’s Guides. They also raise troubling questions as to how any “independent” performance or functional tests of these products can be done by any third-party publication or analyst firm if such wording exists in a EULA.
5. No agreed-upon industry standards for measuring and publishing performance results. Let’s go even further out on a limb and say DCIG did line up all of the products to test, had a testing plan on which everyone agreed, knew in advance that all of the testing would go off without a hitch and faced no EULA restrictions. The questions then become, “What performance results do I publish?” and “What tool or tools do I use to measure the performance?”
This becomes a whole new topic of debate. Granted, there are some tools like Iometer that are used in lieu of any standards, but I know even the results of these tests are debated. Absent any commonly accepted definition of what constitutes “acceptable performance benchmarks,” and of how products should be configured to achieve them, any results published by a third party would likely be viewed with skepticism anyway.
6. Results will vary depending on the skill and knowledge level of the individual conducting the test. The results of every test are almost always somewhat dependent on the skills and knowledge of the individual conducting the test. So even if a “neutral” third party (not an employee of the vendor) is conducting the test, odds are that person is more familiar with one system or a certain subset of systems than with others.
Translated, that person may be able to configure a “good” system to outperform a “great” system just because he or she is more familiar with it. This unintentional bias would almost certainly come into play when testing dozens of products from multiple vendors.
7. Third-party performance benchmarks are usually conducted in very favorable testing conditions. An alternative DCIG could pursue, rather than trying to do testing itself, is to simply rely on the performance results published by other third-party organizations (of which there are a few) or even by the vendor. While this sounds intriguing on the surface and good in theory, this approach is just as problematic.
In most cases, the test is conducted with the vendor providing the product, having its personnel onsite to run the test and then having multiple opportunities to rerun the test to tune its system for optimal performance results. Whether or not one agrees with this approach is beside the point. What is in question is how well these performance results will play out in the real world when all of these “helps” are not in place.
The topic of performance is certainly an important one to end users, and the intent of this blog entry, as well as of the Buyer’s Guides, is in no way to diminish the importance of doing performance testing or to suggest that organizations should not factor performance into their decision-making process. If anything, one purpose of all of the DCIG Buyer’s Guides is to shorten the amount of time that organizations need to spend doing research so they can more quickly get around to actually testing products themselves in their own environment.
It is for these reasons that DCIG has to date refrained from doing or reporting on any type of performance benchmarks in its Buyer’s Guides. This is also why DCIG believes that performance testing is still best left to individual organizations to perform and why DCIG has no formal plans to do performance testing or measure performance on its own.