What It Is: VMware Fault Tolerance (FT) protects a virtual machine in a VMware HA cluster. VMware FT creates a secondary copy of a virtual machine and migrates that copy onto another host in the cluster. VMware vLockstep technology keeps the secondary virtual machine running in lockstep with the primary virtual machine. When the host of a primary virtual machine fails, the secondary virtual machine immediately takes over the workload with zero downtime and zero loss of data.
Use Case: On-Demand Fault Tolerance for Mission-Critical Applications.
VMware FT can be turned on or off on a per-virtual machine basis to protect your mission-critical applications. During critical times in your datacenter, such as the last three days of the quarter when any outage can be disastrous, on-demand VMware FT can protect virtual machines for the critical 72 or 96 hours when protection is vital. When the critical period ends, FT is turned off again for those virtual machines. Turning FT on and off can be automated by scheduling the task for certain times. Refer to the figure below, which shows a server failure on a host running two virtual machines protected by VMware HA and a third virtual machine protected by FT.
The HA-protected virtual machines are restarted on the other host while the FT-protected virtual machine immediately fails over to its secondary and experiences no downtime and no interruption.
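The scheduling decision described in the use case above can be sketched in a few lines of Python. This is only an illustration, not part of any VMware tool: the function names quarter_end and ft_should_be_on are hypothetical, the sketch assumes calendar quarters (your fiscal quarters may differ), and it only decides whether FT should be on for a given day. Actually turning FT on or off is left to a scheduled task in vCenter.

```python
from datetime import date, timedelta


def quarter_end(day: date) -> date:
    """Return the last calendar day of the quarter containing `day`."""
    last_month = ((day.month - 1) // 3 + 1) * 3  # 3, 6, 9, or 12
    if last_month == 12:
        return date(day.year, 12, 31)
    # First day of the month after the quarter, minus one day.
    return date(day.year, last_month + 1, 1) - timedelta(days=1)


def ft_should_be_on(day: date, window_days: int = 3) -> bool:
    """True when `day` is within the last `window_days` days of its quarter.

    Assumption: calendar quarters; window_days=3 mirrors the
    "last three days of the quarter" example in the text.
    """
    end = quarter_end(day)
    start = end - timedelta(days=window_days - 1)
    return start <= day <= end
```

A scheduled job could call ft_should_be_on(date.today()) once a day and trigger the turn-on or turn-off task when the answer changes.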
Step 1: Turn on VMware Fault Tolerance for a virtual machine
Once your cluster is enabled with VMware HA, you can protect any virtual machine with VMware FT, given that the following prerequisites are met:
1. The ESX host must have an FT-enabled CPU. For details please refer to http://kb.vmware.com/kb/1008027.
2. Hosts must be running the same build of ESX.
3. Hosts must be connected via a dedicated FT logging NIC of at least 1 Gbps.
4. The virtual machine being protected must have a single vCPU.
5. The virtual machine’s virtual disks must be thick provisioned.
To enable VMware FT for a virtual machine, right-click the virtual machine called Win2003_VM01 on esx05a, select Fault Tolerance, and click Turn On Fault Tolerance. Note that you need cluster administrator permissions to enable VMware FT.
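The prerequisites listed in this step can be expressed as a simple pre-flight check. The sketch below is a plain Python model, not a call into any VMware API: the Host and VM classes, their field names, and the function ft_prerequisite_problems are all hypothetical, and the values would have to be filled in by hand or from your own inventory data.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Host:
    name: str
    ft_capable_cpu: bool
    esx_build: str
    ft_logging_nic_gbps: float  # 0 if there is no dedicated FT logging NIC


@dataclass
class VM:
    name: str
    vcpus: int
    disks_thick: bool


def ft_prerequisite_problems(vm: VM, hosts: List[Host], ha_enabled: bool) -> List[str]:
    """Return a list of unmet prerequisites; an empty list means none were found."""
    problems = []
    if not ha_enabled:
        problems.append("cluster is not enabled with VMware HA")
    for host in hosts:
        if not host.ft_capable_cpu:
            problems.append(f"{host.name}: CPU is not FT-enabled")
        if host.ft_logging_nic_gbps < 1:
            problems.append(f"{host.name}: needs a dedicated FT logging NIC of at least 1 Gbps")
    if len({host.esx_build for host in hosts}) > 1:
        problems.append("hosts are not running the same build of ESX")
    if vm.vcpus != 1:
        problems.append(f"{vm.name}: must have a single vCPU")
    if not vm.disks_thick:
        # Not a hard stop: thin disks can be converted when FT is turned on.
        problems.append(f"{vm.name}: virtual disks must be thick provisioned")
    return problems
```

For example, describing esx05a and esx05b with identical (hypothetical) build strings and Win2003_VM01 with one vCPU and thick disks yields an empty list.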
Step 2: Convert virtual disks to thick-provisioned virtual disks
VMware FT requires the virtual machine’s virtual disk to be thick provisioned. Thin-provisioned virtual disks can be converted to thick-provisioned during this step.
1. A dialog box will appear indicating that virtual machines must use thick-provisioned virtual disks. Click Yes to convert to thick-provisioned virtual disks and continue with turning on VMware FT.
Step 3: Observe the following actions after turning on VMware FT
The process of turning on FT for the virtual machine has begun and the following steps will be executed:
1. The virtual machine, Win2003_VM01, is designated as the primary virtual machine.
2. A copy of Win2003_VM01 is created and designated as the secondary machine.
3. The secondary virtual machine is migrated to another ESX host in the cluster, esx05b in this case. VMware DRS determines which host the secondary virtual machine is migrated to when FT is turned on. For subsequent failovers, a host for the new secondary virtual machine is chosen by VMware HA.
Win2003_VM01 is now labeled as Protected under Fault Tolerance Status.
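The roles described above, and what happens to them when the primary's host fails, can be illustrated with a toy model. This is a sketch for reasoning only and makes several assumptions: FTPair and fail_over are hypothetical names, the host esx05c in the usage note does not appear in this guide, and picking the first remaining host is merely a stand-in for the placement decision that VMware HA makes in a real cluster.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FTPair:
    vm_name: str
    primary_host: str
    secondary_host: Optional[str]  # None while no secondary is placed


def fail_over(pair: FTPair, failed_host: str, cluster_hosts: List[str]) -> FTPair:
    """Model a failure of the host running the primary virtual machine.

    The secondary takes over as the new primary, and a new secondary is
    placed on another surviving host if one exists. Other failure cases
    are not modelled; the pair is returned unchanged for them.
    """
    if failed_host != pair.primary_host or pair.secondary_host is None:
        return pair
    new_primary = pair.secondary_host
    # Stand-in placement policy (assumption): first host that is neither
    # the failed host nor the host of the new primary.
    candidates = [h for h in cluster_hosts if h not in (failed_host, new_primary)]
    new_secondary = candidates[0] if candidates else None
    return FTPair(pair.vm_name, new_primary, new_secondary)
```

With FTPair("Win2003_VM01", "esx05a", "esx05b") and a cluster of esx05a, esx05b, and a hypothetical esx05c, a failure of esx05a leaves the primary on esx05b and places the new secondary on esx05c. In a two-host cluster the model leaves the secondary unplaced until another host is available.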
Step 5: Observe vSphere Alarms after Host Failure
Certain alarms are built into VMware vSphere to signal failures in ESX hosts as well as virtual machines. During the host failure invoked above, you can see an alarm for the FT-protected virtual machine.
1. Click the Alarms tab for Win2003_VM01. Here an alarm is generated even though the virtual machine’s workload continues to run uninterrupted because of VMware FT.
2. Click the Alarms tab for the rebooted ESX host, esx05a, to see the change in the host connection and power state.