Data Integrity, Physical Security and REST APIs Contribute to Tape’s Ongoing Relevance in the World of Big Data

By Ben Maas | January 8, 2014 | Tape Systems

Though no one would claim that tape will ever again leapfrog disk as the preferred method of data storage, it can be said with confidence that one of the oldest computer storage media is holding steady in its current niche and is here to stay, at least for the foreseeable future.

Expansive tape libraries have remained a necessity as the Big Data market grows larger each year. An interesting illustration of this is that while tape sales dropped by 14% in 2012 overall, sales actually rose by 1% in the third quarter of 2012, and some analysts expect them to increase again by at least 3% in calendar year 2013. Data growth is accelerating, with small, medium and large enterprise organizations alike generating much more data, and storing more of it to tape, than ever before.

The benefits of tape over disk for long-term storage are well documented, but they are worth repeating here to reinforce why tape is still necessary, and why its allure will hold steady or possibly even grow slightly as the need for storage expands.

Tape is good at leaving data at rest. This makes the cost structure of tape particularly attractive to users with large data sets that, once created, stay relatively static. Most Big Data datasets grow constantly, but once data is added, it rarely changes. Disk is at a disadvantage for storing static data because the operational costs of keeping disk drives spinning are much greater than those of tape, which is inherently “set it and forget it.”

Tape also retains advantages in security and reliability. Data backed up to tape is typically more secure than data stored on disk or in the cloud because, in part, tape makes data more difficult to access and retrieve. The average hacker is more likely to spend time trying to hack data stored in a cloud or on disk than to go to the trouble of breaking into a storage facility where tape backups are stored, retrieving those tapes, loading them into a tape library, and then going through them linearly to find and access the data they store.

Lastly, tape is still more physically reliable than disk. For instance, a tape that has snapped can be repaired, making the data accessible again. In contrast, a failed hard disk drive (HDD) is often rendered useless, with the data on it becoming inaccessible.

To put this contrast into perspective, a failed HDD could mean a lost terabyte of data, while the loss on a repaired tape could be limited to as little as a few hundred megabytes. CERN, the European Organization for Nuclear Research that operates the world’s largest particle physics laboratory, is a perfect example of the practical difference between data loss from tape backup and from disk storage. While a few hundred MB are lost each year out of the organization’s 100 PB tape library, CERN loses a few hundred TB during the same period from its hard disk storage repository of 50 PB.
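To make the scale of that difference concrete, the short Python sketch below converts the CERN figures into annual loss fractions. The article only says "a few hundred" in both cases, so the value 300 is purely an assumption for illustration, as are the decimal unit definitions and the function name.

```python
# Decimal storage units (an assumption; binary units would give similar ratios).
MB = 10**6
TB = 10**12
PB = 10**15

# "A few hundred" is not specified in the article; 300 is a placeholder value.
ASSUMED_FEW_HUNDRED = 300


def loss_fraction(lost_bytes: int, total_bytes: int) -> float:
    """Return the share of stored data that was lost (0.0 to 1.0)."""
    return lost_bytes / total_bytes


# Tape: a few hundred MB lost out of a 100 PB library.
tape_loss = loss_fraction(ASSUMED_FEW_HUNDRED * MB, 100 * PB)

# Disk: a few hundred TB lost out of a 50 PB repository.
disk_loss = loss_fraction(ASSUMED_FEW_HUNDRED * TB, 50 * PB)

# How many times larger the disk loss fraction is than the tape loss fraction.
ratio = disk_loss / tape_loss

print(f"tape loss fraction: {tape_loss:.1e}")
print(f"disk loss fraction: {disk_loss:.1e}")
print(f"disk/tape ratio:    {ratio:.1e}")
```

As long as the same "few hundred" figure is assumed on both sides, it cancels out of the ratio, which comes to roughly two million: a factor of one million from TB versus MB, and a factor of two from the disk repository being half the size of the tape library.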

Aside from the legacy advantages that tape still holds over disk, new advances are being made in tape storage, most notably REST APIs, one of the new features identified and scored by the forthcoming DCIG 2014-15 Big Data Tape Library Buyer’s Guide. REST uses a subset of HTTP, a protocol with which many programmers are already familiar, meaning they can write code for it with few complications. Furthermore, REST APIs are web-based services and protocols that make up much of the underpinnings of new public and private cloud datasets.

Roughly one out of every four of the models surveyed for this Buyer’s Guide now support the ability to store data directly to tape using REST. As a way to ease the initial ingest of data and migration between different tiers of data storage, at least one vendor, Spectra Logic, is simulating the Amazon S3 REST protocol. Additionally, nine models further support the ability to store data to secondary cloud storage. The combination of REST APIs with cloud storage being offered more closely alongside tape storage is a significant step forward for the tape storage industry.
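As a rough illustration of why REST lowers the barrier for programmers, the Python sketch below builds an S3-style HTTP PUT request for a single object using only the standard library. The gateway hostname, bucket name and object key are hypothetical, and the function is not any vendor's actual API. Authentication is omitted here; a real S3-compatible endpoint would require signed requests.

```python
import urllib.request


def build_put_request(endpoint: str, bucket: str, key: str,
                      payload: bytes) -> urllib.request.Request:
    """Build (but do not send) an S3-style PUT request for one object.

    The URL layout <endpoint>/<bucket>/<key> follows the path-style
    convention of S3-like services; the endpoint itself is hypothetical.
    """
    url = f"{endpoint.rstrip('/')}/{bucket}/{key}"
    return urllib.request.Request(
        url,
        data=payload,
        method="PUT",
        headers={"Content-Type": "application/octet-stream"},
    )


# Hypothetical names, used only for illustration.
request = build_put_request(
    "https://tape-gateway.example.com",
    "archive-2013",
    "sensor/run-001.dat",
    b"example payload",
)

# Inspect the request instead of sending it, since the endpoint is made up.
print(request.get_method(), request.full_url)
```

Sending the request would be a matter of passing it to `urllib.request.urlopen`, which is left out because the endpoint above does not exist. The point is that storing an object is an ordinary HTTP verb applied to a URL, which any HTTP client can issue.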

The forthcoming DCIG 2014-15 Big Data Tape Library Buyer’s Guide also takes into account LTFS (Linear Tape File System). Although it is an important advancement in tape technology, this feature has not contributed to the expansion of the medium as much as one might have expected. It does appear, however, that REST APIs are set to be the next “big thing” in the near future and will help to further cement tape as the optimum choice for organizations tasked with storing and keeping safe massive quantities of data.

Ben Maas

About Ben Maas

Senior Analyst for DCIG. Linux Kool-Aid Drinker. Twins Groupie. Fascinated by anything with silicon wafers.
