A trend DCIG sees among new products entering the enterprise space is a tendency to take the best of what was previously developed and combine it with new technologies that meet the emerging requirements of today’s organizations. The new VirtuoSO offering from Sepaton reflects this broader industry trend. In this second part of my interview series with Sepaton’s Director of Product Management, Peter Quirk, we discuss which features Sepaton brought forward from its existing S2100 product line and which new features its VirtuoSO platform introduces.
Jerome: What elements did Sepaton bring forward from its S2100 product and what new elements did it add to the VirtuoSO platform?
Peter: The OptiScale™ architecture at the heart of VirtuoSO is a combination of technologies. First, it is a distributed file system running across many nodes, whose shared storage model deeply integrates deduplication technologies for both inline and post-process deduplication.
Secondly, its process management model and fine-grained instrumentation allow it to control these distributed services as if they form a single system. By using a modern sharded database, distributed across the nodes, the system is able to ride through failures of parts of the database implementing the global dictionary, whether due to hardware or software problems. The management infrastructure provides for a highly manageable and well instrumented system.
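As a rough illustration of how a sharded dictionary can ride through the loss of a node, the Python sketch below hashes each fingerprint key to a primary node and copies it to the next node as a replica. The class name, the replica count and the placement rule are all assumptions made for this example; the interview does not describe how VirtuoSO actually places or replicates its global dictionary.

```python
import hashlib


class ShardedDictionary:
    """Toy fingerprint dictionary sharded across nodes (hypothetical design)."""

    def __init__(self, node_names, replicas=2):
        if replicas > len(node_names):
            raise ValueError("replicas cannot exceed the number of nodes")
        self.order = list(node_names)
        self.nodes = {name: {} for name in self.order}  # node -> local shard
        self.replicas = replicas
        self.down = set()  # nodes currently marked as failed

    def _owners(self, key):
        # Hash the key to pick a primary node, then use the following
        # nodes in ring order as replicas (an assumed placement rule).
        digest = hashlib.sha256(key.encode()).digest()
        start = int.from_bytes(digest[:8], "big") % len(self.order)
        return [self.order[(start + i) % len(self.order)]
                for i in range(self.replicas)]

    def put(self, key, value):
        live = [n for n in self._owners(key) if n not in self.down]
        if not live:
            raise RuntimeError("no live replica for key")
        for node in live:
            self.nodes[node][key] = value

    def get(self, key):
        # Try each owner in turn and skip any node that is down.
        for node in self._owners(key):
            if node not in self.down and key in self.nodes[node]:
                return self.nodes[node][key]
        raise KeyError(key)

    def fail(self, name):
        self.down.add(name)
```

With four nodes and two replicas, marking any single node as failed still leaves every key readable from its other owner, which is the behavior the "ride through failures" remark describes in general terms.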
We brought forward the physical architecture from our VTL product, the S2100. Both the S2100 and VirtuoSO use shared storage and multiple ingest nodes in a very similar fashion.
The difference is that the S2100 did not really support a file system abstraction. It used a very high performance, extent-based storage system since it was storing blocks of virtual tape. VirtuoSO, by contrast, presents a file system via its NAS protocols. That file system is built on top of an object store, which will be exposed to applications in a later release.
In the VirtuoSO platform, Sepaton had to implement a fully distributed file system, which it did from the ground up. The software stack is completely new with respect to the file system and inline deduplication. It is really based on an early project we did around a big data platform, which we did not bring to market. There are a lot of HDFS (Hadoop Distributed File System) concepts in the file system used by VirtuoSO, since backup deals mainly with very large sequential file transfers, a single writer per stream and seldom more than one reader, which is similar to Hadoop workloads.
Sepaton implemented its own inline deduplication engine closely coupled to the file system, while the post-process deduplication engine was ported across from its VTL.
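To make the inline idea concrete, here is a minimal Python sketch in which each chunk is fingerprinted as it is written and only previously unseen chunks are stored. Fixed-size chunking, SHA-256 fingerprints and the `DedupStore` name are assumptions chosen for simplicity, not a description of Sepaton's engine. A post-process engine would instead land the data first and run a comparable pass over it later.

```python
import hashlib


class DedupStore:
    """Toy inline deduplicating store (illustrative only)."""

    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size
        self.chunks = {}  # fingerprint -> chunk bytes, stored once
        self.files = {}   # file name -> ordered list of fingerprints

    def write(self, name, data):
        recipe = []
        for off in range(0, len(data), self.chunk_size):
            chunk = data[off:off + self.chunk_size]
            fp = hashlib.sha256(chunk).hexdigest()
            # Inline step: keep the chunk only if its fingerprint is new.
            self.chunks.setdefault(fp, chunk)
            recipe.append(fp)
        self.files[name] = recipe

    def read(self, name):
        # Rebuild the file from its list of fingerprints.
        return b"".join(self.chunks[fp] for fp in self.files[name])

    def stored_bytes(self):
        return sum(len(c) for c in self.chunks.values())
```

Writing the same 16 KB payload twice under two names, where the payload is three identical 4 KB chunks followed by a different one, leaves only two unique chunks in the store while both files still read back intact.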
Below the VirtuoSO file system layer are several components, the most important of which are the data movers. The data mover design is source- and target-aware, and extensible to new sources and targets. Early extensions that we plan to deliver include support for a dedupe source on the client, and support for cloud targets.
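One common way to get a data mover that is extensible to new sources and targets is to code it against small interfaces, so a new source or target becomes a new class rather than a change to the mover. The Python sketch below shows that pattern. The `Source`, `Target` and `move` names are hypothetical, and nothing here reflects VirtuoSO's actual interfaces.

```python
from abc import ABC, abstractmethod


class Source(ABC):
    @abstractmethod
    def read_chunks(self):
        """Yield the stream as a sequence of bytes objects."""


class Target(ABC):
    @abstractmethod
    def write_chunk(self, chunk):
        """Accept one chunk of the stream."""


class MemorySource(Source):
    """Stand-in source; a client-side or file source would subclass Source too."""

    def __init__(self, data, chunk_size=4):
        self.data = data
        self.chunk_size = chunk_size

    def read_chunks(self):
        for off in range(0, len(self.data), self.chunk_size):
            yield self.data[off:off + self.chunk_size]


class MemoryTarget(Target):
    """Stand-in target; a cloud target would implement the same method."""

    def __init__(self):
        self.buffer = bytearray()

    def write_chunk(self, chunk):
        self.buffer.extend(chunk)


def move(source, target):
    # The mover depends only on the two interfaces, never on concrete types.
    moved = 0
    for chunk in source.read_chunks():
        target.write_chunk(chunk)
        moved += len(chunk)
    return moved
```

Calling `move(MemorySource(b"backup-stream"), MemoryTarget())` copies the stream chunk by chunk and returns the number of bytes moved.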
At the lowest layers are the services which coordinate the nodes and storage in the cluster, provide for journaling and recovery, and support performance and health instrumentation.
With a new software stack we were able to introduce a completely new approach to managing the system through a web-based interface implementing the latest responsive techniques to support modern browsers on any desktop or mobile device. The web interface is built on open REST APIs, which will be exposed for partners and customers to use for integration with third-party tools and home-grown automation scripts.
One other element Sepaton did bring across in part from the VTL was its OST stack for supporting NetBackup and BackupExec. Much of that code has been layered on the file system with few changes, while the replication features like opt_dup and A.I.R. will interface to the unified replication engine in VirtuoSO.
Jerome: So it sounds like you brought forward the best of what Sepaton already had on its S2100 and then added some new features as well?
Peter: That is largely true but there is more color to it. Sepaton did have this other project going on, as it was intent on building a new product to complement its existing solution to take us into some new markets. Along the way we realized that it could actually be the foundation for a new file-based backup appliance. We had an option to combine the new and old technologies in one product, but prudently decided not to destabilize the VTL platform by grafting a lot of new features onto it to implement a file system.
The software foundations of the S2100 were well-suited to a VTL design, but not to the distributed file system that this new backup appliance needed. So Sepaton leveraged its investment in hardware, and knowledge of very high-end large scale backup applications, and built another software platform using mostly the same hardware. Over time, we’ll add the VTL protocol to VirtuoSO to provide S2100 customers a way to migrate to VirtuoSO if they need a mix of VTL, NAS and OST protocols in one system.
In part I of this interview series, we discuss how databases and virtual machines (VMs) are just beginning to take full advantage of the benefits that disk offers as a backup target.
In part III of this interview series, we discuss how Sepaton’s VirtuoSO platform examines the nature of the application data being backed up and then automatically implements the best methodology to deduplicate it.