All Host or All Application Failure
If all hosts on the P-VOL side are disabled, or if the application cannot start successfully on any P-VOL host, but both arrays are operational, the service group fails over.
In replicated data cluster environments, the failover can be automatic, whereas in global cluster environments, failover by default requires user confirmation. Multiple service groups can fail over in parallel; TrueCopy does not impose any serialization restrictions on simultaneous device group failover.
However, because the horctakeover command attempts to contact the RAID manager on the original P-VOL side when performing a failover, an inaccessible RAID manager delays the failover until the surviving RAID manager's connection timeout expires. This timeout is defined in the configuration file for the particular RAID manager instance.
Total Site Disaster
In a total site failure, all hosts and the Hitachi array are completely disabled, either temporarily or permanently.
In a replicated data cluster, site failure is detected the same way as a total host failure, that is, by the loss of all LLT heartbeats.
In a global cluster environment, VCS detects the failure by the loss of the Icmp heartbeat between the clusters.
If a failover occurs, the online entry point of the TrueCopy agent runs the horctakeover command. The failover may be delayed because the RAID manager waits for its timeout to expire while trying to contact its peer RAID manager daemon before taking over the disks. This timeout is defined in the configuration file of the RAID manager instance that manages the device group. Make sure the value of the OnlineTimeout attribute of the HTC type is greater than the RAID manager timeout.
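The comparison between the two timeouts can be checked with a small script. The following Python sketch is illustrative only and is not part of the agent. It assumes that the RAID manager timeout is the fourth field of the first entry in the HORCM_MON section of the instance configuration file and that it is expressed in units of 10 ms; verify both assumptions against your RAID manager documentation. The function names and the sample configuration text, including the host name and device path, are hypothetical.

```python
def parse_horcm_mon_timeout(conf_text):
    """Return the HORCM_MON timeout in seconds, or None if it is not found.

    Assumption: the first non-comment line of the HORCM_MON section has the
    fields ip_address, service, poll(10ms), timeout(10ms).
    """
    in_mon = False
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("HORCM_"):
            in_mon = (line == "HORCM_MON")  # track which section we are in
            continue
        if in_mon:
            fields = line.split()
            if len(fields) >= 4:
                return int(fields[3]) * 10 / 1000.0  # 10 ms units -> seconds
    return None


def online_timeout_is_safe(online_timeout_s, conf_text):
    """True if OnlineTimeout (seconds) is strictly greater than the
    RAID manager timeout found in the configuration text."""
    raid_timeout_s = parse_horcm_mon_timeout(conf_text)
    if raid_timeout_s is None:
        raise ValueError("no HORCM_MON timeout found in configuration text")
    return online_timeout_s > raid_timeout_s


# Hypothetical configuration text, for illustration only.
SAMPLE_CONF = """
HORCM_MON
#ip_address   service   poll(10ms)   timeout(10ms)
host1         horcm0    1000         3000

HORCM_CMD
/dev/rdsk/c1t0d0s2
"""

if __name__ == "__main__":
    print(parse_horcm_mon_timeout(SAMPLE_CONF))
    print(online_timeout_is_safe(300, SAMPLE_CONF))
```

With the sample text, the timeout field of 3000 corresponds to 30 seconds, so any OnlineTimeout value of 30 seconds or less would fail the check.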
The online entry point detects whether any synchronization was in progress when the source array was lost. Because the target TrueCopy devices are inconsistent until the synchronization completes, the agent does not write-enable the devices; instead, it times out and faults. You must restore consistent data from a ShadowImage or tape backup.