Part: 09 Basics about Recovery Procedures in Zerto
Zerto provides various options to transfer VMs from the protected site to the recovery site. Let's discuss all of them in this post.
Move Operations
VMs are moved to the recovery site
VMs can be re-protected by reversing the direction of replication
A move operation is generally done when the protected site needs maintenance. More technically, it is called a planned migration. A planned migration assumes that both the protected and recovery sites are operational.
Move differs from failover. In a move operation the VMs are shut down automatically. You cannot pick a checkpoint to restore to: a final checkpoint is taken after shutdown to ensure data integrity, and the VMs are restored only to that last checkpoint.
You can initiate a move from either the protected site UI or the recovery site UI
After checking the VMs on the recovery site, you can commit, which finalizes the move to the recovery site, or you can roll back, which deletes the VMs at the recovery site and powers ON the machines at the protected site.
If reverse replication is not configured, there is no protection for the recovered machines.
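The commit and rollback outcomes of a move can be sketched as a small state model. This is only an illustration in Python and not Zerto's API: the `MoveOperation` class, its fields and the site labels are all hypothetical names chosen for this sketch.

```python
from dataclasses import dataclass


@dataclass
class MoveOperation:
    """Toy model of a move; every name here is hypothetical, not Zerto API."""
    vm_name: str
    reverse_protection: bool = False
    running_at: str = "protected"   # site where the VM is currently powered on
    state: str = "not_started"

    def start(self) -> None:
        # The VM is shut down at the protected site and brought up
        # at the recovery site from the final checkpoint.
        self.running_at = "recovery"
        self.state = "awaiting_commit"

    def commit(self) -> None:
        # Finalize the move: the VM stays at the recovery site.
        self._require_pending()
        self.state = "committed"

    def rollback(self) -> None:
        # Discard the recovery-site VM and power the original back on.
        self._require_pending()
        self.running_at = "protected"
        self.state = "rolled_back"

    def protected_after_commit(self) -> bool:
        # Without reverse replication the recovered VM has no protection.
        return self.reverse_protection

    def _require_pending(self) -> None:
        if self.state != "awaiting_commit":
            raise RuntimeError("move is not awaiting commit or rollback")
```

For example, calling `start()` and then `rollback()` leaves the VM running at the protected site again, while `start()` followed by `commit()` leaves it at the recovery site.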
Uncommitted Move Operations
Changes made while the VMs run at the recovery site are not saved permanently until the move is committed
The difference between a failover test operation (described below) and an uncommitted move operation is that the failover test boots the VMs on the test network, while the uncommitted move boots the VMs on the production network. This allows end users to do end-to-end testing of their application.
All changes are saved in scratch volumes to enable rollback. The operation can continue until the scratch volume is full. Scratch volume size is determined by the journal size hard limit (unlimited by default) and the journal history (4 hours by default).
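As a rough back-of-the-envelope illustration of "until the scratch volume is full": if the journal size hard limit and the write rate of the recovered VMs are known, the remaining time can be estimated. The function name, its parameters and the constant write rate are assumptions made for this sketch, not anything Zerto exposes, and the sketch ignores journal history and any space the journal already uses.

```python
import math
from typing import Optional


def hours_until_scratch_full(hard_limit_gb: Optional[float],
                             write_rate_gb_per_hour: float) -> float:
    """Estimate how long an uncommitted operation can keep running.

    hard_limit_gb=None models the default "unlimited" hard limit.
    """
    if write_rate_gb_per_hour < 0:
        raise ValueError("write rate must not be negative")
    if hard_limit_gb is None or write_rate_gb_per_hour == 0:
        # No configured limit, or nothing is being written.
        return math.inf
    return hard_limit_gb / write_rate_gb_per_hour


print(hours_until_scratch_full(150, 5))    # 30.0
print(hours_until_scratch_full(None, 5))   # inf
```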
Failover Operations
A failover operation is invoked after a disaster has occurred, or when a disaster has to be simulated (i.e. by breaking the link between the sites).
A failover operation assumes that connectivity between the sites is broken BUT the VMs and disks at the protected site are not removed.
The protected VMs must be shut down manually to avoid two instances running at the same time
In a failover operation you can always select from the list of available checkpoints
If both ZVMs are reachable, three options are available for the VMs at the protected site:
Do nothing, i.e. VMs on the protected site are not touched
VMs are shut down gracefully; if the VMs do not shut down, the operation is aborted
VMs are forcefully shut down and the operation continues
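The three options above boil down to a simple decision about whether the failover carries on. A minimal sketch in Python follows; the `ShutdownPolicy` enum and the `failover_may_proceed` function are hypothetical names for illustration and do not correspond to any Zerto API.

```python
from enum import Enum


class ShutdownPolicy(Enum):
    NONE = "do nothing"
    GRACEFUL = "shut down gracefully"
    FORCE = "force shutdown"


def failover_may_proceed(policy: ShutdownPolicy,
                         graceful_shutdown_ok: bool) -> bool:
    """Return True if the failover continues under the chosen policy."""
    if policy is ShutdownPolicy.NONE:
        # Protected VMs are left untouched; the failover goes ahead.
        return True
    if policy is ShutdownPolicy.GRACEFUL:
        # The operation is aborted when the VMs fail to shut down.
        return graceful_shutdown_ok
    # FORCE: VMs are powered off regardless and the operation continues.
    return True
```

Only the graceful option can abort the operation, which is why it is the one to pick when a clean shutdown at the protected site matters more than completing the failover.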
VMs are created at the recovery site, the VMDKs are attached and the DR network is attached. The VMs are then powered ON following the configured boot order.
By default the failover operation is committed automatically without testing. Optionally you can change the commit or rollback option. A commit finalizes the failover, while a rollback aborts the operation.
After the failover operation is completed, i.e. after application and business owners confirm that services are running, and once the original protected site is back up, you can use the move operation to move the recovered VMs back again.
Failover Test Operations
Failover test operation creates test virtual machines in a sandbox
It uses the test network specified in the VPG definition
It restores VMs to a specified point in time using scratch volumes (managed by the VRAs). Scratch volumes are thin-provisioned vdisks, one per VM in the VPG
Production VMs are not impacted during failover test operations. They continue to replicate, and checkpoints continue to be generated.
A failover test operation is time limited. The time available depends on the journal history available at the recovery site and the configured journal size hard limit (unlimited by default)
The following can be verified with a failover test operation without impacting any production servers:
VMs are replicating
VMs can be powered ON
Guest OS inside VM can boot
VMs get IPs as per policy defined in Test network configuration
VMs can be restored to any checkpoint
Clone Operations
A clone operation copies VMs across the WAN
Use case: you wish to save a VM at a specific point in time (especially when you want to be able to restore to that point without depending on checkpoints)
Cloned machines are not powered on
The names of cloned machines are suffixed with the timestamp of the checkpoint used.
Cloned VMs are standalone VMs, i.e. they are not paired with the original VM for any kind of replication. Changes made to the protected VM after the clone operation are not propagated to its cloned counterpart
The table below summarizes each procedure and its impact
Move: Huge impact on the environment. Must have CEO-level approval.
Failover: Business-critical situation; entire operations are down. Needs CIO/CEO-level approval to start recovery.
Clone: Medium impact, especially on network traffic between the sites. The CIO/CEO must at least be informed, and approval is needed from the business head.
NB: I have not included the failover test operation as it has zero impact
I will explain each procedure in detail in the following posts
If you wish to follow entire series of Zerto go to the Landing Page