I arrived home from VMworld 2009 last night after spending much of the flight reflecting on what I learned, the conversations I had and the technologies I had a chance to view. At every conference there is usually one technology that piques my interest, and this one was no different: I had a chance to take a deeper dive into one company’s method of doing virtual machine backup while at the show. What makes this technology transcend other virtual machine backup approaches is that it is by far the most scalable, easiest to implement and simplest to manage that I have yet encountered.
Before I let the cat out of the bag and describe who offers this technology and what it does, here are what I consider the principal issues with virtual machine backup today, none of which any current backup approach does a completely adequate job of addressing.
- Backup agents. The traditional backup software technique requires the deployment of agents that do either full or incremental backups of all of the data on the VM. The problem with this approach is that it introduces additional overhead on the underlying physical machine’s CPU and network connections.
- Disk-based targets. The appeal of using these devices is that they can speed up backups and deduplicate the backup data. The issue is that they do little or nothing to reduce the amount of data processed on the server or sent over the network.
- Deduplicating backup agents. Using this approach, data is deduplicated on the VM before it is sent to the target, which reduces overhead on the physical server’s CPU and network connections. However, it still requires an agent on each VM and still requires the backup software to perform the recovery.
- VCB backups. Using VMware’s native VMware Consolidated Backup (VCB) feature, a snapshot is taken of each VM which can then be backed up. This moves the load associated with the backup off of the host but requires organizations to create proxy servers that can mount and back up these snapshots, and they may also need external storage so the proxy server can mount the snapshots. Further, the scalability of this solution becomes questionable. Backing up 10 or 20 snapshots in a backup window is usually not a problem. Try to back up hundreds or thousands of VM snapshots using a proxy server and the situation becomes untenable.
- Agentless backups. Agentless backup was, until yesterday, the best approach to virtual machine backup that I had encountered. It discovers VMs by communicating with either the VMware ESX server or vCenter, getting a list of VMs on each VMware physical host and then backing up each individual VM. The two main drawbacks that I saw were that it required the backup software to do the recovery and that, although it could scale as the amount of data to back up grew, the backend storage system which executed the backups and kept all of the data tended to become more complex to manage.
The new technology that I saw takes many of these current issues associated with virtual machine backups off of the table. The company and product to which I am specifically referring is PHD Virtual and its esXpress backup software. While I have briefly covered esXpress in a past blog when I wrote about its inclusion on Quantum’s DXi7500 systems, I did not know much about the history of esXpress or the details of how it worked. But as luck would have it, PHD Virtual was at VMworld so I connected with them just before I left yesterday and am glad I did.
Here are the two specific features of esXpress that caught my eye:
- Creates virtual appliances. It creates a virtual appliance on each physical VMware ESX or vSphere host, and these appliances then do the backups. This is significant in at least six ways that are enumerated below. (There are more but, for the sake of space, I opted not to go into them.)
- First, it does not require agents on each individual VM so all that is required to deploy it is the creation of a new virtual appliance on the ESX physical host. According to the PHD engineer I spoke to, creating a virtual appliance can be done in as little as 5 minutes and in less than an hour even if done by a novice.
- Second, it can still do block and/or file level backups of each VM on that host. This is now done without the need to install an agent on each VM or a restart of individual VMs.
- Third, there is no dependency on VCB. This removes the requirement to use external storage on which VM snapshots reside as well as the need to create proxy servers that would mount the snapshots and back up this data.
- Fourth, as it does each backup, it deduplicates and compresses the data. This allows organizations to realistically achieve 20:1 or greater deduplication rates plus it minimizes the CPU and network overhead on the underlying physical server since backups complete faster and there is less data to transmit over the network.
- Fifth, the virtual appliance is only active while it is performing backups. Using VMware’s scheduler, the virtual appliance is turned on when it is time for backups to begin. Once the backups of the VMs on that physical host are complete, it shuts down until it is time for backups to begin again, so server resources are not consumed during the day.
- Sixth, it scales. Whether you are running one physical ESX server or a thousand, it scales because it only backs up the data on the server on which it is running. It sends this data to another server which holds it for recovery. Organizations can then even optionally replicate this directory to another server for DR purposes and, if I understood them correctly, any server can serve as a backup target for any other server.
- Requires no backup software to do the recovery. This was the other major feature that caught my eye. As it backs up data, the data is stored in such a format that it can immediately be presented to another physical server and run as a VM with minimal or no recovery time without requiring someone to first interface with the backup software to recover it.
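For readers who want a feel for what "deduplicates and compresses the data" means in practice, here is a minimal Python sketch of block-level deduplication followed by compression. It makes no claim about how esXpress works internally: the fixed 4 KiB block size, the SHA-256 fingerprints, the use of zlib and every function name below are assumptions chosen purely for illustration.

```python
import hashlib
import zlib

BLOCK_SIZE = 4096  # assumed block size, for illustration only


def dedup_and_compress(images, store=None):
    """Split each image into fixed-size blocks and keep one compressed
    copy of every unique block. Returns the block store and, for each
    image, the ordered list of block fingerprints needed to rebuild it."""
    store = {} if store is None else store
    recipes = {}
    for name, data in images.items():
        recipe = []
        for offset in range(0, len(data), BLOCK_SIZE):
            block = data[offset:offset + BLOCK_SIZE]
            digest = hashlib.sha256(block).hexdigest()
            if digest not in store:
                # Only blocks not seen before are compressed and stored.
                store[digest] = zlib.compress(block)
            recipe.append(digest)
        recipes[name] = recipe
    return store, recipes


def restore(store, recipe):
    """Rebuild an image by decompressing its blocks in order."""
    return b"".join(zlib.decompress(store[digest]) for digest in recipe)


def reduction_ratio(images, store):
    """Logical bytes backed up divided by bytes actually stored."""
    logical = sum(len(data) for data in images.values())
    stored = sum(len(blob) for blob in store.values())
    return logical / stored


if __name__ == "__main__":
    # Two toy "VM images" that share most of their blocks.
    base = bytes(range(256)) * 64
    images = {"vm-a": base, "vm-b": base + b"\x01" * BLOCK_SIZE}
    store, recipes = dedup_and_compress(images)
    assert restore(store, recipes["vm-b"]) == images["vm-b"]
    print(f"unique blocks stored: {len(store)}")
    print(f"reduction ratio: {reduction_ratio(images, store):.1f}:1")
```

The toy images are far more repetitive than real VM disks, so the ratio this prints says nothing about what a real environment would see. The point is only the mechanism: blocks that repeat across VMs on the same host are stored once, which is why less data has to cross the network.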
The way esXpress is architected represents a significant step forward in how virtual machine backups are done. It addresses nearly all of the technical and management concerns currently associated with the backup and recovery of virtual machines and can be swiftly implemented in either small business environments or in the largest enterprises with minimal disruption or training.
While I still have a few questions as to how it handles massively large deduplicated data stores and how it delivers consistent images for the recovery of database applications, I suspect that it offers workarounds to these issues. I simply ran out of time to ask about them since I needed to leave to catch my flight home.
Here’s the bottom line about this technology – WOW! In my opinion, every backup vendor is going to have to adopt this method for the backup of virtual machines because it is so powerful. I fully expect that this company will be acquired very soon because it is a quick fix to VM backups at either the small business or the enterprise level. (And it did not hurt that their personnel on site said that they fully expected to be acquired soon because a number of larger vendors interested in their technology were sniffing around their booth.) But regardless of whether or not PHD is acquired, expect every major backup vendor to come up with their own variation of what esXpress does because they will soon be at a competitive disadvantage in VMware environments if they fail to deliver a similar offering.