On the server virtualization side, VMware vCenter has emerged as a central console that first detects and then centrally manages VMware VMs across the environment. On the storage side, comparable storage array management software such as NEC Storage Manager is now available to complement VMware vCenter: it discovers NEC D, M and S Series storage arrays and then administers their advanced storage software features.
According to IDC, revenue from external disk storage systems totaled over $18 billion in 2010. But what that IDC number does not fully reflect is the growing impact that midrange arrays are having on organizations of all sizes and how well they are positioned to deliver the other key feature that organizations now want in their virtualized environments: Reliability. Among the midrange arrays available, the new NEC M100 storage array is better positioned than most to deliver on these two features.
Virtualization is sweeping through data centers of all sizes and, as it does, it introduces levels of complexity that organizations are ill-equipped to handle. To mitigate this, reference architectures are emerging as a technique to standardize which hardware and software are deployed, under what circumstances, and how they are managed.
Dedupe is an easy concept to grasp. At its most basic level it reduces storage requirements and promises improvements in backup and recovery times. It seems like a “win-win” scenario and, for the most part, it is. But let’s not lose sight of the fact that dedupe is still in its infancy and is being continually fine-tuned and changed. That alone should keep us from becoming complacent about a technology that is still in its early stages.
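To make that basic level concrete, here is a minimal sketch of how block-level deduplication works in principle: a stream is split into chunks, each chunk is fingerprinted, and only chunks not already in the store consume new capacity. The fixed 4 KB chunk size and SHA-256 fingerprint are illustrative assumptions, not how any particular product implements the feature.

```python
import hashlib

CHUNK_SIZE = 4096  # illustrative fixed chunk size; real products vary

def dedupe_store(data: bytes, store: dict) -> list:
    """Split data into fixed-size chunks, keep one copy of each unique
    chunk in 'store', and return the list of chunk hashes that
    reconstructs the original stream."""
    recipe = []
    for i in range(0, len(data), CHUNK_SIZE):
        chunk = data[i:i + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in store:          # new, unique data consumes capacity
            store[digest] = chunk
        recipe.append(digest)            # duplicate chunks cost only a reference
    return recipe

def restore(recipe: list, store: dict) -> bytes:
    """Rebuild the original stream from its chunk references."""
    return b"".join(store[d] for d in recipe)

# Two backups of largely identical data consume little extra space.
store = {}
backup1 = dedupe_store(b"A" * 8192 + b"B" * 4096, store)
backup2 = dedupe_store(b"A" * 8192 + b"C" * 4096, store)
print(len(store))  # 3 unique chunks stored, not 6
```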
Recently Kelly Polanski (another DCIG analyst) and I had a rather lengthy discussion about the value of keeping archive and backup data on disk versus tape long term. We agreed that using disk in some form as an initial backup target makes sense in most environments, but as we started to debate the merits of keeping data on disk versus tape long term, the issue got cloudier. While DCIG has previously argued that eDiscovery is becoming a more compelling reason to keep archive and/or backup data on disk long term, our concerns centered on the fact that some disk-based archival and backup storage systems can become as problematic as tape.
Over the last few months DCIG has spent a fair amount of time researching and documenting specific reasons why tape will not die. Green IT is the reason we most often hear cited for retaining tape, though new disk-based deduplication and replication technologies, coupled with new disk storage system designs based on grid storage architectures, can offset some of those concerns. So before organizations conclude that after 30, 90 or 180 days they should immediately move their archival and backup data, deduplicated or otherwise, from disk to tape just to save money, they should recognize that keeping data on disk provides certain intangible savings from an eDiscovery perspective that are not always feasible on tape.
Almost 3 years ago now, Robin Harris over at Storagemojo.com started posting the list prices for different vendors’ products so customers have at least a starting point when comparing product prices. Though I suspect the list prices associated with these vendors’ offerings have changed since he originally posted some of them, what I found remarkable is how difficult it is to ascertain what a deduplication solution will cost an organization. The difficulty in pricing deduplication solutions has less to do with making sure you are getting deduplication than with making sure your configuration includes all of the options your environment needs, such as failover, NAS or VTL interfaces, data retention periods or replication, so that you can effectively compare different solutions.
Innovation within the data center seems to be on the lips of IT managers, vendors, and analysts alike. Innovation, it is said, will pull us through this economic downturn even as organizations endure budget cutbacks, staff reductions and a general sense of doom and gloom. These innovations include maturing technologies such as virtualization, grid computing and deduplication coupled with management initiatives like consolidation, outsourcing and reduced expansion. These ensure organizations can continue to cut costs and stay on budget while creating more efficient data centers that are ready for whatever tomorrow brings.
Are deduplication guarantees really something you can take to the bank? As more companies look toward using disk in general, and deduplicating systems in particular, as a backup target, deduplication guarantees are emerging as a way to influence users’ decision to deploy deduplicating systems. But in these tightening economic times, deduplication guarantees do not necessarily guarantee money in the bank and may shift your attention away from more critical evaluation criteria such as system reliability, scalability, and performance.
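A quick, hypothetical calculation shows why a guarantee alone is not money in the bank: the value of a guaranteed ratio depends entirely on the ratio your own data actually achieves. The capacities and ratios below are made-up numbers for illustration only.

```python
def effective_capacity(usable_tb: float, dedupe_ratio: float) -> float:
    """Logical backup capacity implied by a usable capacity and a
    deduplication ratio (e.g. 20.0 means 20:1)."""
    return usable_tb * dedupe_ratio

usable_tb = 10.0                                    # hypothetical purchased usable capacity
guaranteed = effective_capacity(usable_tb, 20.0)    # vendor's guaranteed 20:1 ratio
actual     = effective_capacity(usable_tb, 8.0)     # ratio your data actually achieves

print(f"Guaranteed: {guaranteed:.0f} TB logical")   # 200 TB
print(f"Actual:     {actual:.0f} TB logical")       # 80 TB
# The 120 TB gap is what the guarantee has to make whole, usually in
# extra hardware rather than in the reliability, scalability and
# performance the system also needs to deliver.
```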
Having managed multiple types of storage systems from multiple storage vendors, I have seen two flaws that are common across many vendors’ storage systems: the inability to transparently migrate data to subsequent generations of their own hardware and the inability to share administrative permissions with other like storage systems from the same vendor. How acute this problem is depends on how many storage systems a company manages and how often it replaces them. However, any administrator responsible for managing five, ten or more storage systems in today’s enterprise corporations understands exactly what I am talking about.
NAS is sometimes viewed as a challenge by enterprise shops if their intent is to use it as a target for disk-based backup. Two reasons often cited are that there is only a finite amount of storage capacity available on NAS and that backup software does not handle out-of-space conditions on file systems very well. This causes backup job failures as well as performance bottlenecks when multiple backup jobs run concurrently. The use of grid storage architectures in products like the NEC HYDRAstor is helping to put some of these concerns to rest and making NAS a more practical option for use as a target for disk-based backup in enterprise shops.
When I recently attended VMworld 2008, I had the opportunity to get a closer look at NEC’s latest HYDRAstor release, the HS8-2000, and some of its features. Of course at a trade show all you generally have the time and opportunity to do is take a quick look at some of the product’s hardware and software features. But in this case there was a feature on the HYDRAstor that struck me just from the short time I spent evaluating it: the ability to create a 256 petabyte (PB) or larger file system.
The ease with which HYDRAstor’s underlying grid storage architecture lets companies migrate to the higher capacity, faster performing hardware found in its new HS8-2000 makes it easy to overlook some of its other new features. Part of the reason I devoted the last blog entry to HYDRAstor’s self-evolving architecture is that I usually have to do just the opposite: educate readers about the advantages of upgrading to a new product so they can justify the pain of going through the migration. In HYDRAstor’s case, upgrading and migrating to the new HS8-2000 release is so painless that it is almost easy to overlook its new features.
A self-evolving platform is one of the promises behind products like the NEC HYDRAstor that are based on grid storage architectures. Grid storage architectures automatically take over data migrations during technology refreshes which eliminates the need for application downtime or for companies to do forklift upgrades. Yet up to this point it was difficult to establish the validity of that promise for the NEC HYDRAstor since its HS8-1000 series was still in its first release.
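As a rough illustration of the general idea (not NEC's actual placement logic), the sketch below uses simplified consistent hashing to show how a grid can absorb a newer-generation node: only the chunks the new node now owns get migrated, and everything stays addressable while that happens in the background. The node names and chunk counts are hypothetical.

```python
import bisect
import hashlib

def _h(s: str) -> int:
    return int(hashlib.sha256(s.encode()).hexdigest(), 16)

def build_ring(nodes, vnodes=100):
    """Place each node at many points on a hash ring (simplified
    consistent hashing, standing in for real grid placement logic)."""
    return sorted((_h(f"{n}#{i}"), n) for n in nodes for i in range(vnodes))

def node_for(key, ring):
    """A chunk lives on the first node clockwise from its hash."""
    points = [p for p, _ in ring]
    idx = bisect.bisect(points, _h(key)) % len(ring)
    return ring[idx][1]

old_ring = build_ring(["node1", "node2", "node3", "node4"])
new_ring = build_ring(["node1", "node2", "node3", "node4", "node5-nextgen"])

keys = [f"chunk{i}" for i in range(10000)]
moved = sum(1 for k in keys if node_for(k, old_ring) != node_for(k, new_ring))

# Only chunks now owned by the new node move (roughly a fifth of them);
# the rest stay put and every chunk remains readable throughout.
print(f"{moved} of {len(keys)} chunks migrate to the new node")
```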
Replication and deduplication are features that are fast becoming necessities when disk libraries are introduced into enterprise IT backup environments. But as I brought out in a previous blog entry, introducing multiple functions into disk libraries intended for enterprise caliber backup environments typically has some unpleasant trade-offs. A primary concern in enterprise IT shops is how large (or small) to initially configure the solution so companies neither overspend on oversized hardware nor purchase undersized hardware that cannot scale to meet their future requirements. To strike that balance, they need some way to forecast how their IT environment is going to look going forward.
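One simple way to start that forecast is to compound today's backup store by an assumed annual growth rate, as in the back-of-the-envelope sketch below; the 40 TB starting point and 35% growth rate are purely hypothetical.

```python
def projected_capacity(current_tb: float, annual_growth: float, years: int) -> float:
    """Back-of-the-envelope forecast: compound the current backup store
    by an assumed annual growth rate."""
    return current_tb * (1 + annual_growth) ** years

# Hypothetical numbers: 40 TB of backup data today, growing 35% per year.
for year in range(1, 4):
    print(f"Year {year}: {projected_capacity(40, 0.35, year):.0f} TB")
# Year 1: 54 TB, Year 2: 73 TB, Year 3: 98 TB -- a system sized only for
# today's 40 TB is undersized within two years unless it can scale out.
```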
The “all-in-one” concept is one of the hottest trends in consumer technologies. Just looking at the gadgets and devices that I use on a day-to-day basis in my office, I am hard pressed to find one that does not perform multiple tasks. My office phone supports two lines, has a separate voice message box for each line and tracks all of my incoming and outgoing calls. My printer is not just a printer. It prints, copies, scans and faxes. Then, of course, there is my Blackberry which acts as a cell phone, email client, web browser, calculator, personal organizer (contacts/phone book) and a host of other functions that I have not even had time to figure out yet.
Anyone who thinks tape is still the right primary target for backup only needs to watch a video on NEC’s website that includes a testimonial from Orlando, FL-based TLC Engineering. In this testimonial, TLC shares some of its experiences using tape as its primary target for backup and recovery and the hassles associated with it. The situations that the individuals on the video describe are almost comical but, from past experience, I know that TLC’s experiences are more common than not.
However, my intent is not to leave readers hanging or fretting about which storage systems they can select that take this problem into account. The NEC HYDRAstor is one product that has taken steps to address this issue. HYDRAstor includes a feature called Distributed Resilient Data™ (DRD) that offers more protection than RAID 5 or RAID 6 without their rebuild performance drawbacks. Because HYDRAstor is based on a grid storage architecture, it can by default survive the failure of not only multiple disk drives but also multiple Storage Nodes. The default setting is 3 disk drives, or 3 Storage Nodes if multiple nodes are present (based on the video on the HYDRAstor web site, it looks like a company needs at least 12 nodes to have assurance it can recover from the failure of 3 different nodes).
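DRD itself is NEC's proprietary technology, but the resilience model it reflects can be illustrated generically with erasure coding: each write is split into data fragments plus parity fragments, and any surviving set as large as the data fragment count is enough to rebuild. The 9-data/3-parity split below is an assumed example chosen to line up with the 12-node, 3-failure figure above, not a confirmed DRD parameter.

```python
DATA_FRAGMENTS   = 9   # assumed split of each write into data fragments
PARITY_FRAGMENTS = 3   # extra fragments computed from the data

def recoverable(surviving_fragments: int) -> bool:
    """With an erasure code, any DATA_FRAGMENTS of the
    DATA_FRAGMENTS + PARITY_FRAGMENTS pieces suffice to rebuild."""
    return surviving_fragments >= DATA_FRAGMENTS

total = DATA_FRAGMENTS + PARITY_FRAGMENTS   # 12 fragments, ideally one per node
for failures in range(5):
    print(f"{failures} lost fragments -> recoverable: {recoverable(total - failures)}")
# 0 through 3 concurrent failures are survivable; a 4th is not.
# Capacity overhead is PARITY_FRAGMENTS / DATA_FRAGMENTS (about 33%),
# far less than the 300% overhead of keeping three extra full mirrors.
```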
Almost any disk-based solution – deduplicating or otherwise – is going to expedite backups and recoveries. Sure, some solutions may deduplicate better or do it faster but at the end of the day most companies are at the point that putting in place any disk-based system that supports replication and deduplication is better than dealing with the current backup pain. However what companies often fail to account for is how fast their backup data stores grow when they start backing up data to disk. More than once I’ve talked to system administrators in companies where “undisclosed” or “hidden” departmental application servers start to come out of the woodwork once department managers hear that corporate IT backup processes actually work.
The juxtaposition of deduplication and replication in disk-based backup appliances is a powerful combination that companies can use to protect backed up data across data centers as well as data backed up at remote and branch offices (ROBOs). Yet where deduplication ends and replication starts can get a little confusing in grid storage architectures such as the one the NEC HYDRAstor supports with its global deduplication capabilities.
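One way to picture where that line falls, at least in a generic content-addressed design (this is a sketch, not HYDRAstor's implementation), is that deduplication decides which chunks are unique and replication then ships only the chunks the other site does not already hold, with everything else traveling as hash references. The chunk size and sample data below are illustrative.

```python
import hashlib

def chunk_hashes(data: bytes, size: int = 4096) -> dict:
    """Content-address each fixed-size chunk of a backup stream."""
    return {hashlib.sha256(data[i:i + size]).hexdigest(): data[i:i + size]
            for i in range(0, len(data), size)}

def replicate(source_chunks: dict, target_chunks: dict) -> int:
    """Ship only the chunks the target site does not already hold;
    chunks already known there travel as hash references only."""
    sent = 0
    for digest, chunk in source_chunks.items():
        if digest not in target_chunks:
            target_chunks[digest] = chunk   # the only data crossing the wire
            sent += len(chunk)
    return sent

# A remote office backs up data the central site has mostly seen before.
central = chunk_hashes(b"A" * 8192 + b"B" * 4096)
robo    = chunk_hashes(b"A" * 8192 + b"D" * 4096)
print(f"Bytes replicated: {replicate(robo, central)}")  # 4096, not 12288
```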