Synchronization Troubleshooting Options

These are the available troubleshooting options. Each option involves editing a global system configurable parameter to reconfigure the system with a value other than the default.

Enlarging the Sending Queue

The Sending Queue on the cluster member stores locally generated sync updates. Updates in the Sending Queue are replaced by more recent updates. In a highly loaded cluster, updates are therefore kept for less time. If a member is asked to retransmit an update, it can only do so if the update is still in its Sending Queue. The default (and minimum) size of this queue is 512. Each member has one sending queue.

To enlarge the sending queue size:

Change the value of the global parameter fw_sync_sending_queue_size. See Advanced Cluster Configuration.
You must also make sure that the configured queue size survives reboot. See How to Configure a Kernel Parameter to Survive Reboot.

Enlarging this queue allows the member to keep more of its locally generated updates available for retransmission. However, each saved update consumes memory, so consider the memory implications carefully before you change this parameter. Changes take effect only after a reboot.
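The replacement behavior described above can be illustrated with a small model. This is only a conceptual sketch, not the actual kernel implementation: the class SendingQueueModel and its methods are hypothetical names, and the queue is assumed to behave like a simple fixed-size buffer in which the oldest update is dropped when a new one arrives.

```python
from collections import deque

DEFAULT_SENDING_QUEUE_SIZE = 512  # default and minimum, per the text above


class SendingQueueModel:
    """Toy model: a fixed-size buffer of locally generated sync updates."""

    def __init__(self, size=DEFAULT_SENDING_QUEUE_SIZE):
        if size < DEFAULT_SENDING_QUEUE_SIZE:
            raise ValueError("size cannot be below the minimum of 512")
        # deque(maxlen=...) silently drops the oldest entry when full.
        self._updates = deque(maxlen=size)

    def add(self, seq, payload):
        self._updates.append((seq, payload))

    def retransmit(self, seq):
        """Return the payload if the update is still queued, else None."""
        for queued_seq, payload in self._updates:
            if queued_seq == seq:
                return payload
        return None


queue = SendingQueueModel()
for seq in range(600):          # 600 updates into a 512-slot queue
    queue.add(seq, f"update-{seq}")

print(queue.retransmit(50))     # None: replaced by more recent updates
print(queue.retransmit(599))    # update-599
```

With a larger size (for example SendingQueueModel(1024)), update 50 would still be available for retransmission, at the cost of holding more updates in memory.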

Enlarging the Receiving Queue

The Receiving Queue on the cluster member keeps the updates from each cluster member until it has received a complete sequence of updates. The default (and minimum) size of this queue is 256. Each member keeps a Receiving Queue for each of the peer members.

To enlarge the receiving queue size:

Change the value of the global parameter fw_sync_recv_queue_size. See Advanced Cluster Configuration.
You must also make sure that the configured queue size survives reboot. See How to Configure a Kernel Parameter to Survive Reboot.

Enlarging this queue means that the member can save more updates from other members. However, each saved update consumes memory, so consider the memory implications carefully before you change this parameter. Changes take effect only after a reboot.
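The idea of holding updates until a complete sequence has arrived can be sketched as follows. This is a conceptual model under stated assumptions, not the product's implementation: ReceivingQueueModel is a hypothetical name, updates are assumed to carry consecutive sequence numbers, and the text does not say what a member does when the queue is full, so the model only reports that condition.

```python
DEFAULT_RECV_QUEUE_SIZE = 256  # default and minimum, per the text above


class ReceivingQueueModel:
    """Toy model of one per-peer queue that releases updates in sequence."""

    def __init__(self, size=DEFAULT_RECV_QUEUE_SIZE, first_seq=0):
        self.size = size
        self.next_seq = first_seq   # next sequence number that can be processed
        self.pending = {}           # out-of-order updates, keyed by sequence

    def receive(self, seq, payload):
        """Store an update and return the updates that are now ready, in order.

        Returns None when an out-of-order update arrives and the queue is
        already full (the model only reports this; it does not recover).
        """
        if seq < self.next_seq or seq in self.pending:
            return []               # duplicate or already processed
        if seq != self.next_seq and len(self.pending) >= self.size:
            return None
        self.pending[seq] = payload
        ready = []
        while self.next_seq in self.pending:
            ready.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return ready


peer_queue = ReceivingQueueModel()
print(peer_queue.receive(1, "b"))   # []: still waiting for update 0
print(peer_queue.receive(2, "c"))   # []
print(peer_queue.receive(0, "a"))   # ['a', 'b', 'c']
```

Because each member keeps one such queue per peer, the memory cost of a larger size is multiplied by the number of peer members.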

Enlarging the Sync Timer

The sync timer performs sync-related actions at a fixed interval. By default, the sync timer interval is 100ms. The base time unit is 100ms (or 1 tick), which is therefore the minimum value.

To enlarge the sync timer:

Change the value of the global parameter fwha_timer_sync_res. See Advanced Cluster Configuration. The value of this variable can be changed while the system is working. A reboot is not needed.

By default, fwha_timer_sync_res has a value of 1, meaning that the sync timer operates every base time unit (every 100ms). If you configure this variable to n, the timer runs every n*100ms.
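The relationship between the parameter value and the resulting interval is simple arithmetic. A minimal sketch, in which the helper timer_interval_ms is a hypothetical name used only for illustration:

```python
BASE_TICK_MS = 100  # base time unit (1 tick), per the text above


def timer_interval_ms(res):
    """Interval in milliseconds for a timer resolution of `res` ticks."""
    if not isinstance(res, int) or res < 1:
        raise ValueError("resolution must be a whole number of ticks, >= 1")
    return res * BASE_TICK_MS


for res in (1, 2, 5):
    print(f"fwha_timer_sync_res={res} -> every {timer_interval_ms(res)} ms")
```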

Enlarging the CPHA Timer

The CPHA timer performs cluster-related actions at a fixed interval. By default, the CPHA timer interval is 100ms. The base time unit is 100ms (or 1 tick), which is also the minimum value.

If the cluster members are geographically separated from each other, set the CPHA timer to be around 10 times the round-trip delay of the sync network.

Enlarging this value increases the time it takes to detect a failure. For example, if detecting an interface failure takes 0.3 seconds with the default timer, and the timer is doubled to 200ms, the time needed to detect an interface failure doubles to 0.6 seconds.

To enlarge the CPHA timer:

Change the value of the global parameter fwha_timer_cpha_res. See Advanced Cluster Configuration. The value of this variable can be changed while the system is working. A reboot is not needed.

By default, fwha_timer_cpha_res has a value of 1, meaning that the CPHA timer operates every base time unit (every 100ms). If you configure this variable to n, the timer runs every n*100ms.
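The sizing guideline for geographically separated members and its effect on failure detection can be worked through in a few lines. This is a sketch under assumptions: the function names are hypothetical, rounding up to a whole tick is one possible reading of "around 10 times", and the 300ms baseline is the example figure from this section, not a measured value.

```python
import math

BASE_TICK_MS = 100  # base time unit (1 tick)


def suggested_cpha_res(rtt_ms, factor=10):
    """Smallest whole number of ticks that covers factor * round-trip delay."""
    return max(1, math.ceil(factor * rtt_ms / BASE_TICK_MS))


def scaled_detection_ms(baseline_ms, cpha_res):
    """Detection time grows linearly with the timer resolution."""
    return baseline_ms * cpha_res


res = suggested_cpha_res(rtt_ms=20)      # 10 * 20 ms = 200 ms
print(res)                               # 2
print(scaled_detection_ms(300, res))     # 600
```

For a sync network with a round-trip delay of 10ms or less, the sketch returns 1, which is the default value, so no change would be suggested.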

Reconfiguring the Acknowledgment Timeout

A cluster member deletes updates from its Sending Queue (described in Enlarging the Sending Queue) on a regular basis. This frees up space in the queue for more recent updates.

The cluster member deletes updates from this queue if it receives an ACK about the update from the peer member.

Provided that the Block New Connections mechanism (described in Blocking New Connections Under Load) is active, the peer member sends an ACK in either of two circumstances:

After receiving a certain number of updates.
If it did not send an ACK for a certain time. This is important if the sync network has a considerable line delay, which can occur if the cluster members are geographically separated from each other.

To reconfigure the timeout after which the member sends an ACK:

Change the value of the global parameter fw_sync_ack_time_gap. See Advanced Cluster Configuration. The value of this variable can be changed while the system is working. A reboot is not needed.

The default value for this variable is 10 ticks (10 * 100ms). Thus, if a member did not send an ACK for a whole second, it will send an ACK for the updates it received.
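The two ACK conditions and the tick arithmetic can be summarized in a short sketch. It is illustrative only: the function names are hypothetical, and update_threshold stands in for the "certain number of updates", whose actual value is not given in this section.

```python
BASE_TICK_MS = 100          # base time unit (1 tick)
DEFAULT_ACK_TIME_GAP = 10   # ticks, default of fw_sync_ack_time_gap


def ack_timeout_seconds(ack_time_gap=DEFAULT_ACK_TIME_GAP):
    """Convert the ACK time gap from ticks to seconds."""
    return ack_time_gap * BASE_TICK_MS / 1000


def should_send_ack(updates_since_ack, ticks_since_ack,
                    update_threshold, ack_time_gap=DEFAULT_ACK_TIME_GAP):
    """True if either ACK condition from the text is met.

    update_threshold is a placeholder; its real value is not documented here.
    """
    return (updates_since_ack >= update_threshold
            or ticks_since_ack >= ack_time_gap)


print(ack_timeout_seconds())                              # 1.0
print(should_send_ack(3, 10, update_threshold=50))        # True: time gap reached
print(should_send_ack(3, 4, update_threshold=50))         # False: neither condition met
```

On a sync network with considerable line delay, it is the time-based condition that bounds how long the sender waits before it can free entries in its Sending Queue.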

Contact Check Point Support

If the other recommendations do not help solve the problem, contact Check Point Support for further assistance.