Storing archival and backup data in the cloud is high on the list of priorities of many organizations, if for no other reason than that the data remains accessible and available without organizations having to bear the burden of managing the data locally long term. But as more organizations use cloud storage gateways to store this data, they will find distinct differences in how these appliances manage data in the cloud, with differences sometimes existing even between appliances from the same vendor. In this fourth part of my interview series with BridgeSTOR’s CEO John Matze, he reveals the various methods that the BridgeSTOR NAS and VTL cloud gateway appliances use to store, access and manage this data locally and in the cloud.
Jerome: It appears that in the VTL cloud gateway space BridgeSTOR does not have a lot of competition. Is there currently a bunch of low hanging fruit (i.e. sales opportunities) just waiting for you to pick up?
John: That is correct. There are a few startups but because BridgeSTOR is storing data in the cloud in such a unique way, we believe we have a strong competitive advantage. BridgeSTOR is very open about how it stores data in the cloud, and we give organizations a lot of flexibility in terms of which clouds they may store data to, including cloud providers such as Amplidata.
Further, we let people know how we store the data in the cloud. We have a document that we give customers that just says, “Hey here’s how it’s stored up there.” We also have command line utilities. If you want to get a single file back, all you have to do is type in the name of the file in a command line and it will bring it back without even having to go through the cloud gateway.
One of the other features we offer is a device driver that runs on Red Hat Linux. You can literally mount the cloud provider and go directly to its object store, again without needing to go through the gateway.
This approach delivers really high performance as organizations do not have to use CIFS or NFS. This is how BridgeSTOR differentiates itself. But by coming to market now, it got to learn from all the other gateways on what they did and did not do right.
Jerome: In what form factors (physical or virtual) is your VTL cloud gateway available? If available as a virtual appliance, what hypervisor platforms does it run on?
John: We developed it for Hyper-V but it also runs on VirtualBox. We also have it running on a Linux KVM and I assume VMware would run it just fine. There is nothing that we are doing that is specific for any hypervisor other than Hyper-V which requires specific Linux drivers to run in that environment.
Jerome: So the underlying virtual appliance platform is based on Linux?
John: Yes. It’s pretty basic. But on the other side of the coin we can recompile the code and that is how we can put it on entry level NAS solutions like a QNAP box. It is running its own version of Linux so we recompile our code for its environment so QNAP can natively install it on their box without a VM at all.
Jerome: How much storage capacity does your solution require on the NAS appliances to serve as a local disk cache?
John: We do not have a specific capacity requirement at this point. We can manage up to two (2) TBs though minimally we would want it to be a 1:1 ratio with the same amount of local disk cache as what is in the cloud. But when you get to the high end size, it is not going to work that way.
We do have a cache policy with an 80/60 rule. Once the local cache hits 80 percent full we will delete the oldest files until it is down to 60 percent, so you can set the cache size at whatever you want. However, the larger the local disk cache is, the fewer download charges you have for Amazon, so in that sense it is beneficial to put a lot of cache in there.
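To illustrate the 80/60 rule John describes, here is a minimal Python sketch. It is not BridgeSTOR's code: the function name `evict_oldest`, the use of an ordered mapping of file names to sizes, and the choice to measure fullness in bytes are all assumptions made for the example.

```python
from collections import OrderedDict

HIGH_WATERMARK_PCT = 80  # start evicting once the cache is this full
LOW_WATERMARK_PCT = 60   # stop evicting once usage is down to this level


def evict_oldest(cache, capacity_bytes):
    """Apply an 80/60-style policy to a hypothetical local cache.

    `cache` is an OrderedDict mapping file name -> size in bytes,
    ordered oldest first. Returns the names that were evicted.
    """
    used = sum(cache.values())
    evicted = []
    # Nothing happens until usage reaches the high watermark.
    if used * 100 < HIGH_WATERMARK_PCT * capacity_bytes:
        return evicted
    # Delete the oldest files until usage is at or below the low watermark.
    while cache and used * 100 > LOW_WATERMARK_PCT * capacity_bytes:
        name, size = cache.popitem(last=False)  # oldest entry first
        used -= size
        evicted.append(name)
    return evicted
```

In this sketch a larger `capacity_bytes` simply means eviction starts later, which matches the point that a bigger local cache leads to fewer downloads from the cloud.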
Jerome: Essentially what is happening when the backup occurs or when data is sent to either the NFS/CIFS target or the VTL target is that the data will land on the cache on your appliance (physical or virtual). Your appliance will then immediately start replicating out to the Amazon cloud. You will keep it on the local cache until it hits those specified thresholds and then you will start deleting it down. People can put as little or as much cache on the appliance as they want. Yes?
John: The only difference is on the VTL products we will not carry any cache at all. That will just go straight to Amazon. The metadata will be cached, not the physical data. But on our NAS product, yes, you are correct, it will write to both spots. We will write to the local cache and we will write to Amazon at the same time.
Then over time if those files age out we will delete them out of cache. In this manner you will only have the newest data in cache, though you will see the entire view of what data is in the repository since a copy of the metadata is always kept on both the local cache and in the cloud.
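The difference between the two write paths can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class `GatewaySketch`, its method names, and the in-memory dictionaries standing in for the local cache and the cloud object store are all hypothetical, and the sketch does not try to model whether a file read back from the cloud is placed in the cache again.

```python
class GatewaySketch:
    """Toy model of the write paths described above (not BridgeSTOR code)."""

    def __init__(self, cache_data=True):
        self.cache_data = cache_data  # True ~ NAS behavior, False ~ VTL behavior
        self.cloud = {}       # stand-in for the cloud object store: name -> bytes
        self.local_data = {}  # local data cache: name -> bytes
        self.metadata = {}    # local metadata, always kept: name -> size

    def write(self, name, data):
        # Every write goes to the cloud and records metadata locally.
        self.cloud[name] = data
        self.metadata[name] = len(data)
        # Only the NAS-style gateway also keeps the data itself in cache.
        if self.cache_data:
            self.local_data[name] = data

    def age_out(self, name):
        # Drop the cached data but keep the metadata entry.
        self.local_data.pop(name, None)

    def listing(self):
        # The full view of the repository comes from metadata alone.
        return sorted(self.metadata)

    def read(self, name):
        # Serve from cache when possible, otherwise fetch from the cloud.
        if name in self.local_data:
            return self.local_data[name], "cache"
        return self.cloud[name], "cloud"
```

After `age_out`, a file still appears in `listing()` but `read` has to go to the cloud, which is where download charges would come in.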
What is really cool about our solution is that it includes intelligence as well. Using its global file system, if you are in New York and another guy is in Chicago, the appliances share and access the same metadata. So if a guy updates a file in Chicago, and then the one in New York tries to access it, the appliance automatically sees that the cache in New York is out of date so it deletes the local one, brings the other one down from the cloud, and lets the individual in New York edit.
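The New York and Chicago example can be sketched as follows. The interview does not say how the appliances detect that a cached copy is out of date, so the per-file version counter here is an assumption, as are the class names `SharedStore` and `Site`.

```python
class SharedStore:
    """Stand-in for the cloud: shared metadata (versions) plus file data."""

    def __init__(self):
        self.versions = {}  # path -> latest version number (assumed scheme)
        self.objects = {}   # path -> latest data


class Site:
    """One appliance location with its own local cache."""

    def __init__(self, name, store):
        self.name = name
        self.store = store
        self.cache = {}  # path -> (version, data)

    def save(self, path, data):
        # Bump the shared version, upload, and refresh the local cache.
        version = self.store.versions.get(path, 0) + 1
        self.store.versions[path] = version
        self.store.objects[path] = data
        self.cache[path] = (version, data)

    def open(self, path):
        # Compare the cached version against the shared metadata.
        current = self.store.versions[path]
        cached = self.cache.get(path)
        if cached is not None and cached[0] == current:
            return cached[1], "local cache"
        # Stale or missing: discard the local copy and pull the latest one.
        self.cache.pop(path, None)
        data = self.store.objects[path]
        self.cache[path] = (current, data)
        return data, "cloud"
```

With one `SharedStore` and two `Site` objects, a `save` in Chicago makes the next `open` in New York come from the cloud, and the `open` after that is served locally again.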
In part I of this interview series we take a look at some of the different gateway solutions available for accessing public storage clouds and how they differ.
In part II of this interview series we discuss the inner workings of the VTL interface that BridgeSTOR is making available on its cloud gateway appliance.
In part III of my interview series we discuss how using the BridgeSTOR VTL cloud gateway appliance organizations can move their tape museums into the cloud.