In today’s information age our focus tends to be on the here and now and on how quickly we can access information that was created, sometimes, just seconds ago. But in terms of the total amount of data in the digital universe, that is just the tip of the iceberg: possibly as much as 90% of today’s data exists as archival data. Ensuring the integrity of that data and making sure it is stored cost-effectively for decades is the responsibility of today’s new generation of tape libraries. In part 3 of my interview series with Spectra Logic’s CEO Nathan Thompson, we discuss how tape libraries have continued to mature to meet today’s new business demands for retaining archival data for even longer periods of time.
Jerome: How have tape libraries continued to mature – even in the last year?
Nathan: This is a story that is not very well told. 20 years ago disk drives had low reliability but they have tremendously improved over that time. In much the same way, tape libraries and tape applications have also tremendously improved, as has the reliability of tape media.
Today, the reliability of LTO drives and tape libraries is nothing short of spectacular. Vendors have continually invested in new features, capabilities and intelligence for tape, and have delivered on them year after year.
In that vein, I’ll speak to a feature that Spectra Logic released a year and a half ago and that customers have really deployed in their environments over the last 12 months.
We built a feature into our libraries called Data Integrity Verification. Here is what that is: if a user writes a tape on Tuesday, the library itself will load that tape in a separate tape drive on Wednesday, and conduct a quick read verification to confirm that there are no errors that cannot be corrected by the tape drive’s integrated error correction system as the data is written to the drive.
Our T-Series tape libraries can be configured to verify data integrity every six months, or every year, or every five years from that point forward. So a verification and validation system is now built into our tape libraries, at no cost to the user.
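As an illustration of the schedule Nathan describes, here is a minimal Python sketch that computes when a tape would be checked: once the day after it is written, then at a fixed interval from that first check. The function names (`add_months`, `verification_dates`) and the choice to count intervals from the first verification are assumptions made for this example; they are not taken from Spectra Logic's software.

```python
import calendar
from datetime import date, timedelta


def add_months(d, months):
    """Return the date `months` calendar months after `d`.

    The day is clamped to the last day of the target month,
    so 31 August plus six months gives the last day of February.
    """
    index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def verification_dates(written, interval_months, count):
    """List the post-write check plus `count` recurring checks.

    Assumption for this sketch: the first check happens one day
    after the tape is written, and recurring checks are counted
    from that first check.
    """
    first_check = written + timedelta(days=1)
    dates = [first_check]
    for i in range(1, count + 1):
        dates.append(add_months(first_check, i * interval_months))
    return dates


if __name__ == "__main__":
    # A tape written on Tuesday, 3 January 2012, re-verified every six months.
    for d in verification_dates(date(2012, 1, 3), interval_months=6, count=3):
        print(d.isoformat())
```

Changing `interval_months` to 12 or 60 models the yearly and five-year options mentioned above.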
We also announced in November 2011 a new technology for tape media health assurance called CarbideClean. CarbideClean performs an initial cleaning of “green” tapes, meaning tapes that have never been written to, before they are deployed in the customer environment. This reduces debris on the heads of the tape drives and decreases the need for tape drive cleaning.
This CarbideClean process has in fact also resulted in an improvement in actual tape capacity and performance. It is a relatively small increase, maybe a two to three percent capacity improvement coupled with a five to eight percent improvement in speed using this innovation.
The concept of pre-cleaning tape media was brought to our attention by a large customer who observed some characteristics of continuous tape use, so we built a process into our tape libraries to address it.
Another example of ongoing tape library innovation is improved usability. Tape libraries from decades ago were considered very hard to use and to manage. The usability features released in the last year in the most recent Spectra BlueScale 12 software continued our efforts to make tape as easy to manage as disk.
BlueScale 12 included an XML interface. As our customers upgrade to BlueScale 12 (a free upgrade), they can monitor and manage their tape library programmatically using XML.
So if they want somebody monitoring a tape library for any variety of conditions that might occur in a data center, it’s very easy to do. These are just some of the innovations that have recently occurred in tape libraries.
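To make the idea of XML-based monitoring concrete, here is a minimal Python sketch of a script that reads a status document and reports conditions worth an operator's attention. The XML layout is invented for this example: the element and attribute names (`libraryStatus`, `drive`, `partition`, `state`, `freeSlots`) and the `find_alerts` function are hypothetical and do not describe the actual BlueScale interface, which the interview does not detail.

```python
import xml.etree.ElementTree as ET

# Hypothetical status document; NOT the real BlueScale XML schema.
SAMPLE_STATUS = """<libraryStatus>
  <drive id="1" state="ok"/>
  <drive id="2" state="needs_cleaning"/>
  <partition name="archive" freeSlots="12"/>
</libraryStatus>"""


def find_alerts(xml_text, min_free_slots=20):
    """Return human-readable alerts found in a status document.

    Flags any drive whose state is not "ok" and any partition
    with fewer than `min_free_slots` free slots.
    """
    root = ET.fromstring(xml_text)
    alerts = []
    for drive in root.iter("drive"):
        state = drive.get("state")
        if state != "ok":
            alerts.append(f"drive {drive.get('id')}: {state}")
    for partition in root.iter("partition"):
        free = int(partition.get("freeSlots", "0"))
        if free < min_free_slots:
            alerts.append(
                f"partition {partition.get('name')}: only {free} free slots"
            )
    return alerts


if __name__ == "__main__":
    for alert in find_alerts(SAMPLE_STATUS):
        print(alert)
```

In practice the status text would come from the library rather than from a string constant, and the alerts would feed whatever monitoring system the data center already uses.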
Jerome: So as tape libraries offer these new features, what percentage of tape libraries is still being used in the traditional backup and recovery role and what percentage is being deployed in new capacities?
Nathan: I would estimate that, on a day-to-day basis, approximately 70 percent of tape libraries in the field are being used primarily to support backup and disaster recovery while the other 30 percent are being used to support archive.
However, if you look at the amount of data on tape libraries and what type it is, the percentages are probably the other way around. Probably 70 to 80 percent of the amount of information that is stored in the aggregate set of tape libraries that we have installed around the world is archival information and it may be as high as 90 percent. The rest would be backup data.
The growth in unstructured data over the past decade has dramatically increased the amount of data on tape for archive. Most of the really big libraries we have installed (those with more than 5,000 tape slots) are being used for archival.
We have a T-Finity tape library at the Korea Meteorological Administration (KMA), which captures weather history for the southern part of the Korean peninsula. KMA’s Spectra T-Finity tape library is tied to a Cray supercomputer, and archives petabyte upon petabyte of weather history. The reason is that they run models that predict the weather in South Korea, and those models take the history of the weather as input and use it to predict the future.
In their case they need to keep weather history forever. The weather history in Korea that was captured five years ago, or two years ago, or one year ago is being stored in our T-Finity tape library. 100 years from now that information is still going to be important and relevant, because weather will still be predicted.
The only way you can really predict the weather is to access historical climate models. In those climate models you have to plug in previous data and previous weather patterns to see if you are correctly predicting it. That’s one example of the kind of application that will store information forever.
We also have the National Archives and the Library of Congress as accounts, both of which are storing video information. They have large central libraries and are required to maintain information for the life of the republic plus 100 years. So, how best to store all of that data? On spinning disk drives? I hope not.
Then there is NASA, formerly known as the NACA, the source of all of the airfoil designs that every airplane uses. It runs simulations and wind tunnels at NASA Ames, and it keeps that data forever. So there are an enormous number of applications like that, and you just can’t realistically keep that information on disk.
In Part I of this interview series Nathan shares how and why Spectra Logic got its start in the tape business and what differentiates it from almost every other tape manufacturer even today.
In Part II of this interview series Nathan discusses why Spectra Logic decided to double down on tape even as many experts were forecasting its death.
In Part IV of this interview series, Nathan discusses why tape will remain an integral part of backup processes for a long time to come.
In Part V of this interview series, Nathan talks about what new features we can expect to see from tape and what new roles it will be able to assume in just a few years.