It’s no secret that ‘Big Data’ is becoming a ‘Big Problem’ for organizations from a data and storage management perspective. However, what organizations may fail to realize is that the best way to solve their Big Data problems is NOT by mindlessly throwing more resources at them. Rather, it is to look at Big Data more strategically and then tackle the data management problems it creates in one fell swoop using software like CommVault® Simpana® and its OnePass technology.
The current and forecasted growth of Big Data in organizations is well documented. New forces such as manual and machine-generated data and lengthening regulatory requirements are only part of what is forcing organizations to create and maintain ever larger data stores. Additionally, organizations are pulling in and managing data across multiple sources such as scalable file systems, desktops, laptops and public and private clouds, just to name a few. The result is an expanding digital universe that they need to manage ever more efficiently.
If that is not enough, the retention policies that many organizations have in place are at best poorly administered, so they retain much more data, and for longer periods of time, than needed. Companies further aggravate the situation by creating multiple silos of duplicate data through their use of separate archiving and backup software products. Reasons like these are why one analyst firm recently forecast that digital archive capacity would nearly quadruple between now and 2015.
So while data growth appears inevitable, organizations can take steps now to control it and limit Big Data’s impact on them. Three steps they should consider taking immediately are:
- Start to view data and storage management from a strategic perspective in order to reduce silos and duplicate processes
- Make hardware and software acquisitions in light of these more strategic objectives
- Implement a central data management platform to deliver on them
While most organizations would agree in principle with these three steps, identifying and then implementing the right data management platform to manage their growing data stores may be tricky.
On one end of the spectrum are products that take a federated approach to data management. Vendors provide separate software products to deliver a full complement of data management features (archive, backup, reporting, search and data movement). However, these vendors only offer “one product” in the sense that all of the products are available from one vendor. Beyond that, each product has its own agents, catalogs and data stores, so together they do not fully address the challenges of managing Big Data.
Conversely, on the other end of the spectrum are products that deliver only one or a couple of these features. Organizations must then acquire multiple products to get the data management features they want. This approach usually aggravates data management problems as well, since it again creates different catalogs, data stores and policy engines.
This is where CommVault Simpana software has differentiated itself and continues to do so. It delivers the core data management features that companies want in a single product while enabling them to centrally administer this data without the pitfalls that other approaches can create.
Key ways that CommVault software differentiates itself include:
- OnePass technology. To archive, back up, report on or search data stores, data first has to be scanned. Other software applications may have to complete this scan for each operation: scan, backup, scan, archive, scan, report. This is especially troublesome in the era of Big Data as each scan could potentially take days.
Simpana 9 eliminates this concern with its OnePass feature, which provides a single, consolidated agent that indexes and catalogs file data once. This single catalog is then shared, accessed and used by each of Simpana software’s archiving, backup, reporting and search components, which eliminates the time and effort otherwise spent rescanning data in order to manage, move and access it for archive and backup processes.
- Single data store. It is common for archive software to have its own data store and backup software to have its own data store, sometimes even if both archive and backup software are obtained from the same vendor. Simpana eliminates this redundancy by storing all managed data across backup and archive in a single, scalable, hardware-independent virtual repository called the ContentStore.
Using its shared catalog and single ContentStore, Simpana software both controls and tracks what data resides where. A file resides on primary storage and is backed up until the policy to move it to archive storage kicks in, after which it is retained in the archive according to the user’s needs. Once archived, the file no longer consumes primary storage or contributes to the time and resources required to protect the production file system. All copies of the file remain searchable, regardless of whether they were created by the archive or the backup process. Simpana software’s embedded, global deduplication feature further reduces the size of data stores by recognizing like chunks of data across different processes and storing them only once.
- Single policy engine. Organizations that use and implement data management software typically expect to take on a more proactive role in the management of their data. For example, if they no longer need certain files, they may delete them. However, the gap most data management software leaves is that all the copies of these files that may reside in archives and backups are probably not automatically deleted.
Simpana software closes this data management gap by using a single policy engine that references its catalog to track where all data is located, whether in an archive or in a backup. For example, delete and purge can take place as part of OnePass technology archive operations: when a user or application deletes a stub, OnePass can, via policy, either remove the file from the archive or keep it for an extended period of time. This allows individual archived files to be removed from the archive without having to delete an entire job or data set.
- Optimized data placement on storage media. Organizations have more storage tiers than ever before from which to choose: cloud, tape and multiple tiers of disk, just to name a few. However, getting the right data on the right tier at the right time using multiple data management products and processes is just about impossible.
Simpana 9 again addresses this concern. Using a single product, organizations can manage files in archive, backup and production storage across multiple tiers of storage, including the aforementioned cloud, tape and tiers of disk. Once policies are set in Simpana software, it dynamically and automatically moves data to any of these tiers while keeping the data retrievable and accessible at any time.
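To make the ideas in the list above more concrete, here is a toy Python sketch of a single scan that feeds one shared catalog, a single store that keeps each unique chunk only once, and a simple age-based policy that assigns a storage tier. It is only an illustration of the general concepts and not CommVault's implementation: the names `ContentStore`, `CatalogEntry`, `scan_once`, `apply_policy` and `search`, the fixed-size chunking, the SHA-256 hashing and the age threshold are all assumptions made for this example.

```python
import hashlib
from dataclasses import dataclass, field

CHUNK_SIZE = 4  # hypothetical, tiny on purpose so duplicates are easy to see


@dataclass
class ContentStore:
    """Toy single repository: each unique chunk is stored once, keyed by hash."""
    chunks: dict = field(default_factory=dict)

    def put(self, data: bytes) -> list:
        """Split data into fixed-size chunks and return their hash references."""
        refs = []
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)  # keep only unseen chunks
            refs.append(digest)
        return refs


@dataclass
class CatalogEntry:
    path: str
    age_days: int
    chunk_refs: list
    tier: str = "primary"


def scan_once(files: dict, store: ContentStore) -> dict:
    """One pass over the files builds a catalog that later operations reuse."""
    catalog = {}
    for path, (data, age_days) in files.items():
        catalog[path] = CatalogEntry(path, age_days, store.put(data))
    return catalog


def apply_policy(catalog: dict, archive_after_days: int) -> None:
    """Single policy step: mark files older than the threshold as archived."""
    for entry in catalog.values():
        if entry.age_days >= archive_after_days:
            entry.tier = "archive"


def search(catalog: dict, term: str) -> list:
    """Search uses the shared catalog; no second scan of the files is needed."""
    return sorted(path for path in catalog if term in path)


if __name__ == "__main__":
    files = {
        "/data/a.txt": (b"AAAABBBB", 10),
        "/data/b.txt": (b"AAAACCCC", 400),
    }
    store = ContentStore()
    catalog = scan_once(files, store)
    apply_policy(catalog, archive_after_days=365)
    print(len(store.chunks), [(e.path, e.tier) for e in catalog.values()])
```

In this sketch the two files share the chunk `AAAA`, so the store holds three chunks rather than four, and the older file is marked for the archive tier without being scanned a second time. A real product would use far larger chunks and track many more tiers and retention rules.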
Throwing more “cheap” storage capacity at the Big Data challenge is really no solution at all if growing data stores are to remain accessible, searchable, understandable and manageable. No matter how “cheap” storage is or becomes, the growth of Big Data is far outpacing the time and resources available to protect and manage it, so eventually the cost of NOT managing Big Data will bite every organization.
The sooner organizations view Big Data strategically, the sooner they can take the steps necessary to manage it in a cost-effective, efficient and timely manner. However, this will only occur by putting in place a data management solution that can centrally manage and consolidate their growing file systems with a single integrated tool and a single data repository to complement it. Using CommVault Simpana software with its ContentStore, OnePass technology, embedded deduplication, single policy engine and ability to optimize data placement will enable organizations to do exactly that.