By Josh Mazgelis
December 12, 2013 09:30 AM EST
Many people have turned to virtualization for, among other reasons, the increased ease of disaster recovery. Because virtual machines are no more than files on disk connected to standardized hypervisor hardware, those virtual machines can be shared, copied, moved, and recovered almost anywhere. Whether you’re looking at clustered hypervisors with VMware HA/FT, remote datacenters with VMware SRM, or the plenitude of cloud IaaS companies willing to spin up your backups in the event of a failure, all of these solutions provide what I refer to as “hardware availability.”
If there’s a problem with a physical server running the hypervisor, virtual machines can be restarted on an adjacent host. If there’s a bigger issue like a power outage or a site failure, machines can be restarted in another datacenter or in the cloud. No matter what the hardware problem is, the machine itself can simply be run somewhere else.
But what happens if the virtual machine, the operating system running in it, or even the application itself is the point of failure? Copying that virtual machine file to another hypervisor isn’t going to solve the problem: the copy of the VM will have the same issue as the original no matter where it runs. When this happens, you’re forced to choose between rolling back to a copy taken before the failure or accepting the downtime it takes to resolve the issue at hand. For business-critical applications, neither is a desirable option.
While hardware availability is a great boon for IT continuity planners, those same planners need to recognize that one size does not fit all. If a given business service absolutely, positively needs to stay online, it’s important to look for ways to build in resilience to failures and provide true application availability.