Cluster Failover

What is Failover?

Failover is a cluster redundancy operation that automatically occurs if a Cluster Member is not functional. When this occurs, other Cluster Members take over for the failed Cluster Member.

In a High Availability mode:

In a Load Sharing mode:

To tell each Cluster Member that the other Cluster Members are alive and functioning, the ClusterXL Cluster Control Protocol (CCP) maintains a heartbeat between Cluster Members. If no CCP packets are received from a Cluster Member within a predefined time, that Cluster Member is assumed to be down, and a cluster failover can occur.
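The heartbeat timeout described above can be illustrated with a minimal sketch. This is not ClusterXL code and does not speak CCP. The class `HeartbeatMonitor`, the member names, and the 3-second timeout are all hypothetical, chosen only to show how "no packets within a predefined time" turns into "presumed down".

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class HeartbeatMonitor:
    """Hypothetical model of heartbeat tracking; not the real CCP logic."""

    # Assumed value for illustration; the actual CCP timeout is not stated here.
    timeout_seconds: float
    # Maps a member name to the time its last heartbeat was received.
    last_seen: Dict[str, float] = field(default_factory=dict)

    def record_heartbeat(self, member: str, now: float) -> None:
        """Store the arrival time of a heartbeat from a peer member."""
        self.last_seen[member] = now

    def members_presumed_down(self, now: float) -> List[str]:
        """Return members whose last heartbeat is older than the timeout."""
        return sorted(
            member
            for member, seen in self.last_seen.items()
            if now - seen > self.timeout_seconds
        )


monitor = HeartbeatMonitor(timeout_seconds=3.0)
monitor.record_heartbeat("member_a", now=0.0)
monitor.record_heartbeat("member_b", now=0.0)
monitor.record_heartbeat("member_a", now=2.5)  # member_b stays silent

down = monitor.members_presumed_down(now=4.0)
print(down)  # ['member_b']
```

Timestamps are passed in explicitly rather than read from the system clock, which keeps the sketch deterministic and easy to follow.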

Note that more than one Cluster Member may encounter a problem that results in a cluster failover event. If all Cluster Members encounter such problems, ClusterXL tries to choose a single Cluster Member to continue operating. The state of the chosen member is reported as Active Attention. This situation lasts until another Cluster Member fully recovers. For example, if a cross cable connecting the sync interfaces on two Cluster Members malfunctions, both Cluster Members detect an interface problem. One of them changes to the Down state, and the other to the Active Attention state.

When Does a Failover Occur?

A failover takes place when one of the following occurs in a cluster:

For more on failovers, see sk62570.

What Happens When a Cluster Member Recovers?

In a High Availability mode:

In a Load Sharing mode:

How a Recovered Cluster Member Obtains the Security Policy

The Administrator installs the Security Policy on the cluster object, rather than separately on individual Cluster Members. The policy is automatically installed on all Cluster Members. The policy is sent to the IP addresses defined in the General Properties page of each Cluster Member object.

When a failed Cluster Member recovers, it first tries to fetch a policy from one of the peer Active Cluster Members. The assumption is that the other Cluster Members have a more up-to-date policy. If fetching a policy from a peer Cluster Member fails, the recovered Cluster Member compares its own local policy to the policy on its Management Server. If the policy on the Management Server is more up to date than the one on the recovered Cluster Member, the policy is fetched from the Management Server. If the Cluster Member does not have a local policy, it retrieves one from the Management Server. This ensures that all Cluster Members use the same policy at any given moment.
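The order of checks above can be summarized as a small decision function. This is a sketch of the described flow only, not the product's implementation. The names `PolicySource` and `choose_policy_source` are hypothetical, and representing policy freshness as an integer timestamp is an assumption. The final branch (keeping the local policy when the Management Server's copy is not newer) is implied by the text rather than stated outright.

```python
from enum import Enum
from typing import Optional


class PolicySource(Enum):
    """Where a recovered member ends up taking its policy from (hypothetical)."""

    PEER = "peer"
    MANAGEMENT = "management server"
    LOCAL = "local"


def choose_policy_source(
    peer_fetch_succeeded: bool,
    local_policy_time: Optional[int],
    management_policy_time: int,
) -> PolicySource:
    """Apply the checks in the order described for a recovering member.

    Times are assumed to be comparable integers (larger means newer).
    A local_policy_time of None means the member has no local policy.
    """
    # 1. A policy fetched from a peer Active member is preferred.
    if peer_fetch_succeeded:
        return PolicySource.PEER
    # 2. With no local policy at all, use the Management Server.
    if local_policy_time is None:
        return PolicySource.MANAGEMENT
    # 3. Otherwise fetch from the Management Server only if its policy is newer.
    if management_policy_time > local_policy_time:
        return PolicySource.MANAGEMENT
    # 4. Assumed fallback: the local policy is kept.
    return PolicySource.LOCAL


print(choose_policy_source(False, 10, 20).value)  # management server
```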