Synchronization Troubleshooting Options

These are the available troubleshooting options. Each option involves editing a global system configurable parameter to give it a value different from the default.

Increasing the Size of the Sending Queue

The Sending Queue on the Cluster Member stores locally generated Delta Sync updates. Updates in the Sending Queue are replaced by more recent updates. In a highly loaded cluster, updates are therefore kept for less time. If a Cluster Member is asked to retransmit an update, it can only do so if the update is still in its Sending Queue. The default (and minimum) size of this queue is 512. Each Cluster Member has one sending queue.
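
The replacement behavior described above can be illustrated with a toy model. This is only a sketch: the class name SendingQueue, its methods, and the use of integer sequence numbers are assumptions made for illustration and do not reflect the actual kernel implementation.

```python
from collections import deque


class SendingQueue:
    """Toy model of a fixed-size queue in which new updates displace the oldest."""

    def __init__(self, size=512):
        # 512 is the default (and minimum) size stated in the text.
        self._updates = deque(maxlen=size)

    def store(self, seq):
        # Appending to a full deque silently discards the oldest entry.
        self._updates.append(seq)

    def can_retransmit(self, seq):
        # Retransmission is possible only while the update is still queued.
        return seq in self._updates


queue = SendingQueue(size=512)
for seq in range(600):  # 600 updates pass through a 512-slot queue
    queue.store(seq)

print(queue.can_retransmit(50))   # False: displaced by newer updates
print(queue.can_retransmit(599))  # True: still in the queue
```

In this model, a larger queue keeps older updates available for retransmission longer under the same update rate.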

To increase the size of the Sending Queue:

  1. Change the value of the global kernel parameter fw_sync_sending_queue_size. See Advanced Cluster Configuration. You can change the value of this kernel parameter only permanently. The change takes effect only after a reboot.
  2. You must also make sure that the required queue size survives reboot. See How to Configure a Kernel Parameter to Survive Reboot.

Enlarging this queue allows the Cluster Member to keep more of its locally generated updates available for retransmission. However, be aware that each saved Delta Sync update consumes memory. When changing the value of this kernel parameter, you should carefully consider the memory implications.

Increasing the Size of the Receiving Queue

The Receiving Queue on the Cluster Member keeps the updates from each Cluster Member until it has received a complete sequence of updates. The default (and minimum) size of this queue is 256. Each Cluster Member keeps a Receiving Queue for each of the peer Cluster Members.

To increase the size of the Receiving Queue:

  1. Change the value of the global kernel parameter fw_sync_recv_queue_size. See Advanced Cluster Configuration. You can change the value of this kernel parameter only permanently. The change takes effect only after a reboot.
  2. You must also make sure that the required queue size survives reboot. See How to Configure a Kernel Parameter to Survive Reboot.

Enlarging this queue means that the Cluster Member can save more updates from other Cluster Members. However, be aware that each saved Delta Sync update consumes memory. When changing the value of this kernel parameter, you should carefully consider the memory implications.
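
To get a feel for the memory implications, a rough worst-case estimate can be sketched as follows. The per-update size is not given in this section, so bytes_per_update is a purely hypothetical input, and the function name is invented for illustration. The multiplication by the number of peers follows from the statement that each Cluster Member keeps one Receiving Queue per peer.

```python
def receiving_queue_memory_bytes(queue_size, peer_count, bytes_per_update):
    """Worst-case memory if every per-peer Receiving Queue is completely full."""
    if queue_size < 256:
        raise ValueError("256 is the minimum Receiving Queue size")
    return queue_size * peer_count * bytes_per_update


# Hypothetical figures: a 4-member cluster (3 peers per member),
# queue size raised to 1024, and an assumed 1 KiB per saved update.
print(receiving_queue_memory_bytes(1024, 3, 1024))  # 3145728
```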

Increasing the Sync Timer

The Sync Timer performs Delta Sync related actions at a fixed interval. By default, the Sync Timer interval is 100ms. The base time unit is 100ms (or 1 tick), which is therefore the minimum value.

To increase the Sync Timer:

Change the value of the global kernel parameter fwha_timer_sync_res. See Advanced Cluster Configuration. You can change the value of this kernel parameter on the fly, while the system is running. A reboot is not needed.

To make sure that the new Sync Timer interval value survives reboot, see How to Configure a Kernel Parameter to Survive Reboot.

By default, fwha_timer_sync_res has a value of 1, meaning that the Sync Timer operates every base time unit (every 100ms). If you configure the value of this kernel parameter to N, the Sync Timer operates every N*100ms.
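
The relationship between the parameter value and the resulting interval is simple arithmetic, sketched here. The function and constant names are illustrative only.

```python
BASE_TICK_MS = 100  # base time unit (1 tick) stated in the text


def timer_interval_ms(resolution):
    """Interval in milliseconds for a timer resolution of N ticks."""
    if resolution < 1:
        raise ValueError("1 tick is the minimum value")
    return resolution * BASE_TICK_MS


print(timer_interval_ms(1))  # 100 (default)
print(timer_interval_ms(3))  # 300
```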

Increasing the CPHA Timer

The CPHA Timer performs cluster related actions at a fixed interval. By default, the CPHA Timer interval is 100ms. The base time unit is 100ms (or 1 tick), which is also the minimum value.

If the Cluster Members are geographically separated from each other, set the CPHA Timer to be around 10 times the round-trip delay of the sync network.
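
One way to turn this guideline into a parameter value is sketched below. The guideline says "around 10 times", so rounding up to a whole number of ticks is an assumption of this sketch, as are the function and argument names.

```python
import math


def cpha_res_for_rtt(rtt_ms, factor=10, tick_ms=100):
    """Suggest a timer value (in ticks) of about `factor` times the round-trip delay."""
    # Round up to whole ticks and never go below the 1-tick minimum.
    return max(1, math.ceil(rtt_ms * factor / tick_ms))


print(cpha_res_for_rtt(5))   # 1: 50ms fits within the default 100ms interval
print(cpha_res_for_rtt(25))  # 3: 250ms rounded up to 300ms
```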

Enlarging the CPHA Timer interval increases the time it takes to detect a failure. For example, if detecting an interface failure takes 0.3 seconds, and you double the CPHA Timer interval to 200ms, then the time needed to detect an interface failure doubles to 0.6 seconds.
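
The example above implies that detection time scales linearly with the timer value. A minimal sketch under that assumption follows; the function name is illustrative, and baseline_s is the detection time measured at the default value of 1.

```python
def detection_time_s(baseline_s, resolution):
    """Failure detection time, assuming it scales linearly with the timer value."""
    return baseline_s * resolution


print(detection_time_s(0.3, 2))  # 0.6
```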

To increase the CPHA Timer:

Change the value of the global kernel parameter fwha_timer_cpha_res. See Advanced Cluster Configuration. You can change the value of this kernel parameter on the fly, while the system is running. A reboot is not needed.

To make sure that the new CPHA Timer interval value survives reboot, see How to Configure a Kernel Parameter to Survive Reboot.

By default, fwha_timer_cpha_res has a value of 1, meaning that the CPHA Timer operates every base time unit (every 100ms). If you configure the value of this kernel parameter to N, the CPHA Timer operates every N*100ms.

Reconfiguring the Acknowledgment Timeout

A Cluster Member deletes updates from its Sending Queue on a regular basis. This frees up space in the queue for more recent updates.

The Cluster Member deletes updates from this queue if it receives an ACK about the update from the peer member.

The peer Cluster Member sends an ACK in one of two circumstances - on condition that the Block New Connections mechanism (described in Blocking New Connections Under Load) is active:

To reconfigure the timeout after which the member sends an ACK:

Change the value of the global kernel parameter fw_sync_ack_time_gap. See Advanced Cluster Configuration. You can change the value of this kernel parameter on the fly, while the system is running. A reboot is not needed.

To make sure that the newly configured value survives reboot, see How to Configure a Kernel Parameter to Survive Reboot.

The default value of this kernel parameter is 10 ticks (10 * 100ms). Thus, if a member has not sent an ACK for a whole second, it sends an ACK for the updates it has received.
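
The timeout rule can be modeled in a few lines. This is a simplified sketch and not the kernel logic: the function name and the choice of "greater than or equal to" at the boundary are assumptions.

```python
TICK_MS = 100  # base time unit stated in the text


def ack_due(ticks_since_last_ack, ack_time_gap=10):
    """True when the member has gone a full time gap without sending an ACK."""
    return ticks_since_last_ack >= ack_time_gap


print(10 * TICK_MS / 1000)  # 1.0: the default gap in seconds
print(ack_due(9))           # False
print(ack_due(10))          # True
```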