This section lists the ClusterXL error messages.
This log message can happen if the working mode of the cluster members is not the same, for example, if one member is running High Availability, and another Load Sharing Multicast or Unicast mode. In this case, the internal ClusterXL mechanism tries to synchronize the configuration of the cluster members, by changing the working mode to the lowest common mode. The order of priority of the working modes (highest to lowest) is: 1. Synchronization only 2. Load Sharing 3. High Availability (Active Up) 4. High Availability (Primary Up).
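The fallback rule described above can be sketched in a few lines of Python. This is only an illustration of the stated priority order, not ClusterXL's internal logic. The mode strings and the function name `lowest_common_mode` are assumptions made for the example, and "lowest common mode" is read here as the lowest-priority mode among those the members report.

```python
# Priority order taken from the text, highest to lowest.
MODE_PRIORITY = [
    "Synchronization only",
    "Load Sharing",
    "High Availability (Active Up)",
    "High Availability (Primary Up)",
]


def lowest_common_mode(member_modes):
    """Return the lowest-priority mode among those the members report.

    Hypothetical helper: an unknown mode string raises ValueError.
    """
    if not member_modes:
        raise ValueError("at least one member mode is required")
    # A larger index in MODE_PRIORITY means a lower priority.
    return max(member_modes, key=MODE_PRIORITY.index)


# One member in Load Sharing, another in High Availability (Primary Up).
print(lowest_common_mode(["Load Sharing", "High Availability (Primary Up)"]))
```

Under this reading, a cluster with mixed modes settles on whichever reported mode sits furthest down the priority list.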
This log message can occur during policy installation on the cluster. It means that a serious configuration problem exists in that cluster. Probably some other cluster has been configured with identical parameters and both of them have common networks.
This is caused when the member that printed this message stops hearing certain types of messages from member X. Verify that cphaprob state shows all members as active and that fw ctl pstat shows that sync is configured correctly and working properly on all members. In such a case it is fair to assume that there was a temporary connectivity problem that was fixed in the meantime. There may be several connections that may suffer from connectivity problems due to that temporary synchronization problem between the two members. On the other hand, this can indicate that the other member is really down.
A member of the same cluster as the reporting member has more than three virtual IP addresses defined on the same interface. This is not a supported configuration and will harm ClusterXL functionality.
This is a license error message: If you have a basic Security Gateway license then sync is also licensed. Check the basic Security Gateway license using cplic print and cplic check.
Several problems of this sort can happen during a full sync session when connections are opened and closed during the full sync process. Full sync is automatic as far as possible, but it is not fully automatic for reasons of performance. A Security Gateway continues to process traffic even when it is serving as a full sync server. This can cause some insignificant problems, such as a connection that is deleted twice, a link to an existing link, and so forth. It should not affect connectivity or cause security issues.
The cluster is not synchronized. This usually happens in OPSEC certified third-party Load Sharing products for which Support non-sticky connections is unchecked in the cluster object 3rd Party Configuration page.
The critical device (also known as Problem Notification, or pnote) mechanism can only store up to 16 different devices. An attempt to configure the 17th device (either by editing the cphaprob.conf file or by using the cphaprob -d ... register command) will result in this message.
Each device registered with the pnote mechanism must have a unique name. This message may appear when registering a new pnote device, and means that the device <NAME> is already registered with pnote number <NUMBER>.
Indicates an attempt to unregister a device which is not currently registered.
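The three pnote rules above (at most 16 devices, unique names, and no unregistering of an unknown device) can be modeled with a small registry. The sketch below is a toy model for reasoning about which message to expect. The class name, method names, exception types, and pnote numbering scheme are all hypothetical and are not part of the cphaprob tooling.

```python
MAX_PNOTES = 16  # limit stated in the text


class PnoteRegistry:
    """Toy model of the critical device (pnote) table; not Check Point code."""

    def __init__(self):
        self._devices = {}  # device name -> pnote number (numbering is assumed)

    def register(self, name):
        # Rule: each registered device must have a unique name.
        if name in self._devices:
            raise ValueError(
                f"device {name} is already registered with pnote number "
                f"{self._devices[name]}"
            )
        # Rule: the mechanism stores at most 16 different devices.
        if len(self._devices) >= MAX_PNOTES:
            raise RuntimeError(f"cannot register more than {MAX_PNOTES} devices")
        used = set(self._devices.values())
        number = next(n for n in range(MAX_PNOTES) if n not in used)
        self._devices[name] = number
        return number

    def unregister(self, name):
        # Rule: only a currently registered device can be unregistered.
        if name not in self._devices:
            raise KeyError(f"device {name} is not registered")
        del self._devices[name]


registry = PnoteRegistry()
for i in range(MAX_PNOTES):
    registry.register(f"device{i}")
try:
    registry.register("one_too_many")
except RuntimeError as err:
    print(err)
```

Unregistering a device frees its slot, so a later registration succeeds again in this model.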
A log indicating that there is a different policy id between the two or more members was not sent. Verify all cluster members have the same policy (using fw stat). It is recommended to re-install the policy.
This message can be received when ClusterXL hears CCP packets of clusters of version 4.1. In that case it can be safely ignored.
The following error messages can appear in SmartView Tracker Active mode. These errors indicate that some entries may not have been successfully processed, which may lead to missing synchronization information on a cluster member and inaccurate reports in SmartView Tracker.
Indicates a configuration problem on a clustered member. Either synchronization is misconfigured, or there is a problem with transmitting packets on the sync interface. To get more information on the source of the problem
Indicates that a clustered member has dropped SmartView Tracker Active mode updates in order to maintain sync functionality.
This message appears when the local member receives a retransmission request for a sequence number which is no longer in its sending window. This message can indicate a sync problem if the sending member didn't receive the requested sequence.
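To make the retransmission case above concrete, the following sketch models a bounded sending window. It is a simplified illustration: the class name `SendWindow`, its methods, and the window size are assumptions, and the real sync protocol is not described at this level in this guide.

```python
from collections import OrderedDict


class SendWindow:
    """Toy sending window that keeps only the most recent updates."""

    def __init__(self, size):
        self.size = size
        self._window = OrderedDict()  # sequence number -> payload

    def send(self, seq, payload):
        self._window[seq] = payload
        # Evict the oldest entries once the window is full.
        while len(self._window) > self.size:
            self._window.popitem(last=False)

    def retransmit(self, seq):
        """Return the payload, or None if seq already left the window."""
        return self._window.get(seq)


window = SendWindow(size=3)
for seq in range(1, 6):
    window.send(seq, f"update-{seq}")

print(window.retransmit(5))  # still in the window
print(window.retransmit(1))  # evicted, so it cannot be retransmitted
```

A request for an evicted sequence number is the situation in which the message is logged: the sender has nothing left to resend.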
These messages may appear only during full sync. While performing full sync the delta sync updates are being saved and are applied only after the full sync process has finished. It is possible to limit the memory used for saving delta sync updates by setting the fw_sync_max_saved_buf_mem variable to this limit.
This message may appear due to high load resulting in the sync buffer being filled faster than it is being read.
This message may appear due to a problem starting the full sync process, and indicates a severe problem. Contact Technical Support.
This message could appear under extremely high load, when a synchronization update was permanently lost. A synchronization update is considered to be permanently lost when it cannot be retransmitted because it is no longer in the transmit queue of the update originator. This scenario does not mean that the Security Gateway will malfunction, but rather that there is a potential problem. The potential problem is harmless if the lost sync update was to a connection that runs only on a single member as in the case of unencrypted (clear) connections (except in the case of a failover when the other member needs this update).
The potential problem can be harmful when the lost sync update refers to a connection that is non-sticky (see Non-Sticky Connections), as is the case with encrypted connections. In this case the other cluster member(s) may start dropping packets relating to this connection, usually with a TCP out of state error message (see TCP Out-of-State Error Messages). In this case it is important to block new connections under high load, as explained in Blocking New Connections Under Load.
The following error message is related to this one.
These messages appear when there was a temporary sync problem and some of the sync updates were not synchronized between the members. As a result some of the connections might not survive a failover.
The previous error message is related to this one.
Previous versions used a kernel table called non_sync_ports to implement selective sync, which is a method of choosing services that don't need to be synchronized. Selective sync can now be configured from SmartDashboard. See Choosing Services That Do Not Require Synchronization.
When the synchronization mechanism is under load, TCP packet out-of-state error messages may appear in the Information column of SmartView Tracker. This section explains how to resolve each error.
These messages occur when a FIN packet is retransmitted after deleting the connection from the connection table. To solve the problem, in SmartDashboard Global properties for Stateful Inspection, enlarge the TCP end timeout from 20 seconds to 60 seconds. If necessary, also enlarge the connection table so it won't fill completely.
This message occurs when a SYN is received on an established connection, and the sequence verifier is turned off. The sequence verifier is turned off for a non-sticky connection in a cluster (or in SecureXL). Some applications close connections with a RST packet (in order to reuse ports). To solve the problem, enable this behavior to specific ports or to all ports. For example, run the command:
fw ctl set int fw_trust_rst_on_port <port>
This tells the Security Gateway to trust a RST coming from the specified port. If a single port is not enough, the same behavior can be enabled for every port.
These messages mean that automatic proxy ARP entries for static NAT configuration might not be properly installed.
These messages mean that an illegal CPHA packet was received and will be dropped. If this happens more than a few times during boot, the cluster malfunctions.
A notification that the operation fw ctl set int fwha_magic_mac succeeded.
A notification that the operation fw ctl set int fwha_magic_mac failed. Previous MAC values will be retained.
These messages mean that an internal error in registration to the IPSO clustering mechanism has occurred. Verify that the IPSO version is supported by this Security Gateway version and that the IPSO IP Clustering or VRRP cluster is configured properly.
A notification that should be normally received during Security Gateway initialization and removal.
These messages may appear as a result of a problem in the interaction between the IPSO and ClusterXL device monitoring mechanisms. A reboot should solve this problem. If the problem recurs, contact Check Point Technical Support.
If a reboot (or cpstop followed by cpstart) is performed on a cluster member while the cluster is under severe load, the member may fail to start correctly. The starting member will attempt to perform a full sync with the existing active member(s) and may in the process use up all its resources and available memory. This can lead to unexpected behavior.
To overcome this problem, define the maximum amount of memory that the member may use when starting up for synchronizing its connections with the active member. By default this amount is not limited. Estimate the amount of memory required as follows:
Memory required for full sync (in MB), by number of open connections (rows) and new connections/second (columns):

| Number of open connections | 100 | 1000 | 5000 | 10,000 |
|----------------------------|-----|------|------|--------|
| 1000                       | 1.1 | 6.9  |      |        |
| 10000                      | 11  | 69   | 329  |        |
| 20000                      | 21  | 138  | 657  | 1305   |
| 50000                      | 53  | 345  | 1642 | 3264   |

Note - These figures were derived for cluster members using the Windows platform, with Pentium 4 processors running at 2.4 GHz.
For example, if the cluster holds 10,000 connections and the connection rate is 1000 connections/sec, you will need 69 MB for full sync.
Define the maximum amount of memory using the Security Gateway global parameter: fw_sync_max_saved_buf_mem.
The units are in megabytes. For details, see Advanced Cluster Configuration.
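The estimate from the table above can also be expressed as a simple lookup. The sketch below copies the table values as given. The function name `full_sync_memory_mb` and the data layout are assumptions made for illustration, and the figures apply only to the platform named in the note.

```python
# New connections/second, in the column order of the table.
RATES = [100, 1000, 5000, 10000]

# Memory in MB per number of open connections; None marks cells the table leaves blank.
FULL_SYNC_MB = {
    1000: [1.1, 6.9, None, None],
    10000: [11, 69, 329, None],
    20000: [21, 138, 657, 1305],
    50000: [53, 345, 1642, 3264],
}


def full_sync_memory_mb(open_connections, new_conn_per_sec):
    """Look up the table figure; raises if the table has no such cell."""
    if open_connections not in FULL_SYNC_MB or new_conn_per_sec not in RATES:
        raise LookupError("combination is not listed in the table")
    value = FULL_SYNC_MB[open_connections][RATES.index(new_conn_per_sec)]
    if value is None:
        raise LookupError("the table gives no figure for this combination")
    return value


# The example from the text: 10,000 connections at 1000 connections/sec.
print(full_sync_memory_mb(10000, 1000))
```

The returned number is the value, in megabytes, that the text suggests using for fw_sync_max_saved_buf_mem. The lookup does not interpolate between rows or columns, because the source gives no formula for doing so.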