Upgrading ClusterXL Deployments
Planning a Cluster Upgrade
When upgrading ClusterXL, the following options are available to you:
- Minimal Effort Upgrade: Select this option if you have a period of time during which network downtime is allowed. The minimal effort method is much simpler because each cluster member is upgraded in the same way as an individual gateway.
- Zero Downtime: Select this option if network activity is required during the upgrade process. The zero downtime method assures both inbound and outbound network connectivity at all times during the upgrade. There is always at least one active member that handles traffic.
Note - During the upgrade procedure, standby members are upgraded first. When the upgrade of the final active member begins, the active member fails over to the standby member (or members, depending on the deployment: High Availability or Load Sharing). At this point, because the connection tables of the cluster members are not synchronized, all open connections are lost. Only a full connectivity upgrade (between minor versions) preserves open connections.
- Full Connectivity Upgrade: Choose this option if your gateway needs to remain active and all open connections must be maintained. There is always at least one active member that handles traffic and open connections are maintained during the upgrade.
Permanent Kernel Global Variables
When upgrading each cluster member, verify that changes to permanent kernel global variables are not lost (see sk26202). For example, if fwha_mac_magic and fwha_mac_forward_magic were set to values other than the default values, then verify that these values remain unchanged after the upgrade.
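One way to make this check repeatable is to record the values before the upgrade and compare them afterward. The following is a minimal sketch in Python, not a Check Point tool. It assumes that you have already collected the values on the member by some means. The helper name changed_kernel_params and the dictionaries are hypothetical, and the numbers are sample data rather than recommended settings.

```python
def changed_kernel_params(before, after):
    """Return {name: (old, new)} for every parameter whose value differs.

    A parameter that is missing after the upgrade is reported with new=None.
    """
    changed = {}
    for name, old_value in before.items():
        new_value = after.get(name)
        if new_value != old_value:
            changed[name] = (old_value, new_value)
    return changed


# Sample values written down by hand before and after the upgrade (illustrative only).
before_upgrade = {"fwha_mac_magic": 250, "fwha_mac_forward_magic": 251}
after_upgrade = {"fwha_mac_magic": 254, "fwha_mac_forward_magic": 251}

for name, (old, new) in changed_kernel_params(before_upgrade, after_upgrade).items():
    print(f"{name}: was {old}, now {new}; restore the value (see sk26202)")
```

An empty result means that every recorded variable kept its value on that member.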
Ready State During Cluster Upgrade/Rollback Operations
When cluster members of different versions are present on the same synchronization network, cluster members of the previous version become active while cluster members of the new (upgraded) version remain in a special state called Ready. In this state, the cluster members with the new version do not process any traffic destined for the cluster IP address. This is the expected behavior during the upgrade process.
To avoid this behavior during an upgrade or rollback, disconnect the cluster interfaces and the synchronization network of that cluster member, either physically or with ifconfig, before you begin.
Upgrading 32/64-bit Cluster Members
High Availability cluster deployments support 32/64-bit configurations. In a cluster that contains both 32-bit and 64-bit members, the 64-bit member changes to Ready state and does not synchronize with other members. When you upgrade or replace cluster members, make sure that all the cluster members are configured to the same version (32-bit or 64-bit).
Upgrading OPSEC Certified Cluster Products
- When upgrading IP appliance clusters (VRRP and IP Clusters), use the Zero Downtime or the Minimal Effort procedure.
- When upgrading third-party clustering products, use the Minimal Effort procedure.
- If the third party vendor has an alternative for a zero downtime upgrade, refer to their documentation before upgrading.
Minimal Effort Upgrade on a ClusterXL Cluster
In a Minimal Effort Upgrade, which requires a period during which network downtime is allowed, each cluster member is treated as an individual gateway: you upgrade each member in the same way that you upgrade an individual gateway. For additional instructions, refer to Upgrading a Distributed Deployment.
Zero Downtime Upgrade on a ClusterXL Cluster
This section describes the procedure for a zero downtime upgrade. Zero Downtime is supported on all modes of ClusterXL, including IPSO's IP clustering and VRRP. For additional third-party clustering solutions, consult your third-party solution guide.
To perform a zero downtime upgrade, first upgrade all but one of the cluster members.
We recommend that you do not install a new policy on the cluster until the last member is upgraded. If you must do this, see Installing a Policy during Cluster Upgrade.
To upgrade all but one of the cluster members:
- To avoid possible problems with switches around the cluster, we recommend that you switch the CCP protocol to Broadcast mode on all cluster members. Run cphaconf set_ccp broadcast on all cluster members.
  Note - cphaconf set_ccp takes effect immediately. It does not require a reboot, and the setting survives a reboot. To switch the CCP protocol back to Multicast mode after the upgrade, run cphaconf set_ccp multicast on all cluster members.
- Assume cluster member A is the active member, and members B and C are standby members.
- In Load Sharing mode, randomly choose one of the cluster members to upgrade last.
- Make sure that the previously upgraded software blade licenses are attached to members B and C.
- Attach the previously upgraded licenses to all cluster members (A, B and C) as follows:
- On the SmartConsole GUI machine, open SmartUpdate, and connect to the Security Management server. The updated licenses are displayed as Assigned.
- Use the Attach assigned licenses option to Attach the Assigned licenses to the cluster members.
- Upgrade cluster members B and C in one of the following ways:
- Using SmartUpdate
- In Place
When the upgrade of B and C is complete, reboot them.
- In SmartDashboard:
  - In the Install Policy window, clear the option "For Gateway Clusters, install on all the members, if it fails do not install at all", located under the option "Install on each selected Module independently".
  - In the Gateway Cluster General Properties window, change the Cluster version to the new version.
  - Install the security policy on the cluster.
  The policy installs successfully on cluster members B and C. Policy installation fails on member A and generates a warning. The warning can be safely ignored.
- Run cphaprob stat on a cluster member and verify that the status of cluster member A is Active or Active Attention. The remaining cluster members have a Ready status. The status Active Attention is given if member A's synchronization interface reports that its outbound status is down, because it is no longer communicating with other cluster members.
- Upgrade cluster member A in one of the following ways:
  - Using SmartUpdate
  - In Place
  During the upgrade, cpstop runs automatically, causing A to fail over to member B, member C, or both, depending on whether this is a Load Sharing or High Availability configuration.
- Reboot cluster member A.
- Run cphaconf set_ccp multicast on all cluster members. This returns the Cluster Control Protocol to Multicast mode (instead of Broadcast). You can skip this step if you prefer to keep the Cluster Control Protocol in Broadcast mode.
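The verification step before the last member is upgraded can be expressed as a small check. The following Python sketch is illustrative only. It does not run or parse cphaprob stat. It assumes that you have read the state of each member from the command output yourself and entered it in a dictionary. The function name safe_to_upgrade_last_member is hypothetical.

```python
def safe_to_upgrade_last_member(states, last_member):
    """Check the state pattern described in the verification step.

    states maps a member name to the state string reported for it.
    The last (not yet upgraded) member must be Active or Active Attention,
    and every other (already upgraded) member must be Ready.
    """
    if states.get(last_member) not in ("Active", "Active Attention"):
        return False
    others = [state for name, state in states.items() if name != last_member]
    # An empty list of other members means there is nothing to fail over to.
    return bool(others) and all(state == "Ready" for state in others)


# States transcribed by hand for members A, B and C (sample data).
states = {"A": "Active Attention", "B": "Ready", "C": "Ready"}
print(safe_to_upgrade_last_member(states, "A"))
```

If the check fails, do not start the upgrade of member A until the states match the pattern given in the procedure.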
Zero Downtime Upgrade of SecurePlatform ClusterXL to Gaia ClusterXL
In this procedure, the gateway cluster has an active member (A), and two backup members (B and C). First upgrade B and C, and then upgrade A.
To do a zero downtime upgrade of a ClusterXL gateway cluster:
- Upgrade the backup members (B and C). See Upgrading an Open Server from SecurePlatform to Gaia or Upgrading an Appliance from SecurePlatform to Gaia.
- Verify that the active member (A) is Active, and that B and C are Ready. On each member, run cphaprob stat
- Transfer traffic to members B and C by stopping traffic on A. On A, run cphastop
- Upgrade member A, as above.
- Install the policy on A.
Converting a Security Gateway Cluster to VSX
Use the VSX Gateway Conversion wizard in SmartDashboard to convert a Gaia High Availability cluster of Security Gateways to a VSX cluster. The settings of each Security Gateway are applied to the VSX Gateway (VS0). For more about using the Conversion wizard, see sk79260.
You can only convert a cluster that uses the Gaia operating system.
Important - There is no loss of connectivity during the conversion process. You cannot use the conversion wizard to convert a Load Sharing cluster of Security Gateways.
ClusterXL Optimal Service Upgrade
Use the Optimal Service Upgrade feature to upgrade a Security Gateway or VSX cluster from R75.40VS to R76 and future major releases. This feature upgrades the cluster with a minimum loss of connectivity.
When you upgrade the cluster, two cluster members are used to process the network traffic. New connections that are opened during the upgrade procedure are maintained after the upgrade is finished. Connections that were opened on the old version are discarded after the upgrade.
You can also use the Optimal Service Upgrade feature to upgrade a VSX cluster from R67.10 to R76. When you use this feature to upgrade from VSX R67.10, download the R67.10 upgrade hotfix and install it on one VSX cluster member. For more about upgrading to R67.10, see the R67.10 Release Notes.
For more about the Optimal Service Upgrade and to download the R67.10 upgrade hotfix, go to sk74300.
Upgrade Workflow from R75.40VS
This workflow describes how to upgrade a cluster without losing connectivity.
- Old cluster member - This cluster member processes the earlier connections.
- New cluster member - This cluster member is upgraded to R76 and processes new connections.
Note - Do not use this workflow to upgrade a VSX cluster from R67.10.
Figure: Diagram of Cluster Members
Summary
- Cluster with four members.
- Disconnect all the cluster members from the network, except for the member with the hotfix.
- Configure the fwha_mac_magic parameter on the member with the hotfix.
- Upgrade the cluster members to R76, except for the old cluster member.
- Disconnect the old cluster member from the network.
- Connect all the cluster members to the network.
- Upgrade the old member to R76.
Upgrading the Cluster from R75.40VS
Two cluster members are used to maintain connectivity, while you upgrade all the other cluster members.
The default value for the fwha_mac_magic parameter is 254. If your configuration uses a different value, make sure that you configure the applicable fwha_mac_magic parameter on all the cluster members. For more about the fwha_mac_magic parameter, see the R76 ClusterXL Administration Guide.
To use the Optimal Service Upgrade to upgrade the cluster members:
- Disconnect all cluster members from the network, except for one cluster member.
  Make sure that the management interfaces are not connected to the network.
- On the old cluster member (connected to the network), configure the fwha_mac_magic parameter. Run fw ctl set int fwha_mac_magic <value>
  Make sure that all the cluster members use the same value for the fwha_mac_magic parameter.
- Install R76 on all the cluster members that are not connected to the network.
- On the old cluster member, run cphaosu start
- Reconnect the SYNC interface of one new cluster member to the network.
- Move traffic to the new cluster member that is connected to the network. Do these steps:
  - Make sure the new cluster member is in Ready state.
  - Connect the other interfaces of the new cluster member to the network.
  - On the new cluster member, run cphaosu start
  - On the old cluster member, run cphaosu stat
    The network traffic statistics are shown.
  - When the old cluster member does not have many connections, run cphaosu finish
- On the new cluster member, run cphaosu finish
- Disconnect the old cluster member from the network.
- Reconnect the other new cluster members to the network one at a time. Do these steps on each cluster member:
  - Run cphastop
  - Connect the new cluster member to the network.
  - Run cphastart
- Upgrade the old cluster member and reconnect it to the network.
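Because the commands alternate between the old and the new cluster member, it can help to track them as an ordered checklist. The following Python sketch is illustrative only. It lists only the commands named in the procedure above, in the order given there, and leaves out the physical steps (disconnecting, installing R76, reconnecting). The names OSU_SEQUENCE and next_command are hypothetical, and the sketch does not run anything on a gateway.

```python
# (member role, command) pairs in the order given by the procedure above.
OSU_SEQUENCE = [
    ("old", "fw ctl set int fwha_mac_magic <value>"),
    ("old", "cphaosu start"),
    ("new", "cphaosu start"),
    ("old", "cphaosu stat"),
    ("old", "cphaosu finish"),
    ("new", "cphaosu finish"),
]


def next_command(completed):
    """Return the next (role, command) pair, or None when the sequence is done.

    completed is the list of pairs already run. It must be a prefix of
    OSU_SEQUENCE; anything else means a command was skipped or reordered.
    """
    if completed != OSU_SEQUENCE[:len(completed)]:
        raise ValueError("commands were run out of order")
    if len(completed) == len(OSU_SEQUENCE):
        return None
    return OSU_SEQUENCE[len(completed)]


# After the first two commands, the next one belongs to the new member.
role, command = next_command(OSU_SEQUENCE[:2])
print(f"Next: on the {role} cluster member, run {command}")
```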
Upgrade Workflow from R67.10 VSX
Use the Optimal Service Upgrade feature to upgrade a VSX cluster from R67.10 to R76 without losing connectivity. When you upgrade the cluster, use two cluster members to process the network traffic.
- Old cluster member - The R67.10 VSX Gateway on which you install the Optimal Service Upgrade hotfix.
- New cluster member - VSX Gateway that is upgraded to R76 and processes new connections.
Figure: Diagram of Cluster Members
Summary
- VSX cluster with four R67.10 VSX Gateways.
- Install the Optimal Service Upgrade hotfix on the cluster member that is connected to the network.
- Disconnect all the cluster members from the network, except for the member with the hotfix.
- Configure the fwha_mac_magic parameter on the member with the hotfix.
- Upgrade the cluster members to R76, except for the old hotfix cluster member.
- On the old cluster member, run the hotfix.
- Disconnect the old cluster member from the network.
- Connect all the cluster members to the network.
- Upgrade the old member to R76.
Upgrading the Cluster from R67.10 VSX
Two cluster members are used to maintain connectivity, while you upgrade all the other cluster members.
The default value for the fwha_mac_magic parameter is 254. If your configuration uses a different value, make sure that you configure the applicable fwha_mac_magic parameter on all the cluster members. For more about the fwha_mac_magic parameter, see the R76 ClusterXL Administration Guide.
To use the Optimal Service Upgrade to upgrade the R67.10 VSX cluster members:
- Install the Optimal Service Upgrade hotfix on a cluster member. This is the old cluster member with hotfix.
  Run fw1_HOTFIX_ELSA_HFA_OSU_076_843076007_1
- Disconnect all old cluster members from the network, except for one cluster member.
  Make sure that the management interfaces are not connected to the network.
- Configure the fwha_mac_magic parameter on the old cluster member. Run fw ctl set int fwha_mac_magic <value>
  Make sure that the old and new cluster members use the same value for the fwha_mac_magic parameter.
- Install R76 on all the cluster members that are not connected to the network.
- On the old cluster member, run cphaosu start
- Reconnect the SYNC interface of one new cluster member to the network.
- Move traffic to the new cluster member that is connected to the network. Do these steps:
  - Make sure the new cluster member is in Ready state.
  - Connect the other interfaces of the new cluster member to the network.
  - On the new cluster member, run cphaosu start
  - On the old cluster member, run cphaosu stat
    The network traffic statistics are shown.
  - When the old cluster member does not have many connections, run cphaosu finish
- On the new cluster member, run cphaosu finish
- Disconnect the old cluster member from the network.
- Reconnect the other new cluster members to the network one at a time. Do these steps on each cluster member:
  - Run cphastop
  - Connect the new cluster member to the network.
  - Run cphastart
- Upgrade the old cluster member and reconnect it to the network.
Troubleshooting the Upgrade
Use these cphaosu commands if there are problems during the upgrade process.
- If it is necessary to roll back the upgrade, run cphaosu cancel on the new member. The old member processes all the traffic.
- After you run cphaosu finish on the old member, you can continue to process the old traffic on the old member and the new traffic on the new member. Run cphaosu restart on the old member.
Limitations
- Implement the upgrade procedure when there is minimal network traffic.
- If there is a member failure during the upgrade, the Optimal Service Upgrade procedure does not provide redundancy.
- Do not apply configuration changes during the upgrade process.
- These connections do not survive the upgrade process:
  - Complex connections, for example:
    - DCE RPC
    - SUN RPC
    - Back Web
    - DHCP
    - IIOP
    - FreeTel
    - WinFrame
    - NCP
  - VPN
  - Dynamic routing
  - Bridge mode (L2) configurations
Full Connectivity Upgrade on a ClusterXL Cluster
ClusterXL clusters can be upgraded while at the same time maintaining full connectivity between the cluster members.
Understanding a Full Connectivity Upgrade
The Full Connectivity Upgrade (FCU) method assures that synchronization is possible from old to new cluster members without losing connectivity. A full connectivity upgrade is only supported from R76 to a future minor version that specifically supports FCU.
Connections that have been opened on the old cluster member will continue to "live" on the new cluster member.
In discussing connectivity, cluster members are divided into two categories:
- New Members (NMs): Cluster members that have already been upgraded. NMs are in the "non-active" state.
- Old Members (OMs): Cluster members that have not yet been upgraded. These cluster members are in an "active state" and carry all the traffic.
Supported Upgrade Scenarios
FCU when upgrading and also changing the OS.
Check Point Clustering Solution
OS Type Changing from: | ClusterXL | IP clustering | VRRP
SecurePlatform to Gaia | No FCU | No FCU | No FCU
Not Changing | FCU | FCU | FCU
IPSO to Gaia | No FCU | No FCU | No FCU
IPSO to SecurePlatform | No FCU | No FCU | No FCU
- Legacy High Availability is not supported in FCU.
- For other third-party support, refer to the third-party documentation.
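The support table above can be encoded as a lookup for use in a pre-upgrade check. The following Python sketch is illustrative only. The row and column labels are copied from the table, and the names SOLUTIONS, OS_CHANGES and fcu_supported are hypothetical.

```python
# Column order of the table: the Check Point clustering solutions.
SOLUTIONS = ("ClusterXL", "IP clustering", "VRRP")

# One row per OS change; True means FCU, False means No FCU.
OS_CHANGES = {
    "SecurePlatform to Gaia": (False, False, False),
    "Not Changing": (True, True, True),
    "IPSO to Gaia": (False, False, False),
    "IPSO to SecurePlatform": (False, False, False),
}


def fcu_supported(os_change, solution):
    """Look up one cell of the table.

    Raises KeyError for an OS change that is not in the table and
    ValueError for a clustering solution that is not in the table.
    """
    row = OS_CHANGES[os_change]
    return row[SOLUTIONS.index(solution)]


print(fcu_supported("Not Changing", "VRRP"))
print(fcu_supported("IPSO to Gaia", "ClusterXL"))
```

The lookup covers only the table. Legacy High Availability and third-party solutions fall outside it, as noted above.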
Full Connectivity Upgrade Prerequisites
Make sure that the new member (NM) and the old member (OM) have the same policy and product installation. During the upgrade, do not change the policy from the last policy installed before this upgrade.
Full Connectivity Upgrade Limitations
Registered connections modules:
No.  Name            Newconn   Packet    End       Reload    Dup Type  Dup Handler
0:   Accounting      00000000  00000000  d08ff920  00000000  Special   d08fed58
1:   Authentication  d0976098  00000000  00000000  00000000  Special   d0975e7c
3:   NAT             00000000  00000000  d0955370  00000000  Special   d0955520
4:   SeqVerifier     d091e670  00000000  00000000  d091e114  Special   d091e708
6:   Tcpstreaming    d0913da8  00000000  d09732d8  00000000  None
7:   VPN             00000000  00000000  d155a8d0  00000000  Special   d1553e48
Verify that the list of Check Point Gateway names is the same for both cluster members.
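Comparing two such listings by eye is error-prone, so the comparison can be sketched in a few lines of Python. This is an illustration, not a Check Point tool. It assumes that the names to compare are those in the Name column of the listing above, and that each row starts with a number followed by a colon. The function names are hypothetical, and the SynDefender row in the second sample is made up to show a difference.

```python
import re

# A data row starts with "<number>:" followed by the module name.
ROW = re.compile(r"^\s*(\d+):\s+(\S+)")


def module_names(listing):
    """Extract the Name column from a 'Registered connections modules' listing."""
    names = []
    for line in listing.splitlines():
        match = ROW.match(line)
        if match:
            names.append(match.group(2))
    return names


def name_differences(old_listing, new_listing):
    """Return (names only on the old member, names only on the new member)."""
    old, new = set(module_names(old_listing)), set(module_names(new_listing))
    return sorted(old - new), sorted(new - old)


OLD_MEMBER = """No. Name Newconn Packet End Reload Dup Type Dup Handler
0: Accounting 00000000 00000000 d08ff920 00000000 Special d08fed58
1: Authentication d0976098 00000000 00000000 00000000 Special d0975e7c
3: NAT 00000000 00000000 d0955370 00000000 Special d0955520
4: SeqVerifier d091e670 00000000 00000000 d091e114 Special d091e708
6: Tcpstreaming d0913da8 00000000 d09732d8 00000000 None
7: VPN 00000000 00000000 d155a8d0 00000000 Special d1553e48"""

# Same listing plus one invented row, to demonstrate a mismatch.
NEW_MEMBER = OLD_MEMBER + "\n5: SynDefender 00000000 00000000 00000000 00000000 None"

only_old, only_new = name_differences(OLD_MEMBER, NEW_MEMBER)
print("Only on old member:", only_old)
print("Only on new member:", only_new)
```

Two empty lists mean that the names match on both members.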
Performing a Full Connectivity Upgrade
The procedure for updating a cluster with full connectivity varies according to the number of members in the cluster.
To upgrade a cluster with two members:
Do the steps outlined in Zero Downtime Upgrade on a ClusterXL Cluster. Before you do step 7 in this section, run this command on the upgraded member:
fw fcu <other member ip on sync network>
(for example, fw fcu 172.16.0.1). Then continue with step 8 of Supported Modes.
To upgrade a cluster with three or more members, do one of these:
For more than three members, divide the upgrade of your members so that the active cluster members can handle the amount of traffic during the upgrade.
Note - cphastop can also be executed from the Cluster object in the SmartConsole. After cphastop is executed, do not run cpstart or cphastart and do not reboot the machine.
Displaying Upgrade Statistics (cphaprob fcustat)
cphaprob fcustat displays statistical information regarding the upgrade process. Run this command on the new member. Typical output looks like this:
During FCU....................... yes
Number of connection modules..... 23
Connection module map (remote -->local)
0 --> 0 (Accounting)
1 --> 1 (Authentication)
2 --> 3 (NAT)
3 --> 4 (SeqVerifier)
4 --> 5 (SynDefender)
5 --> 6 (Tcpstreaming)
6 --> 7 (VPN)
Table id map (remote->local)..... (none or a specific list,
depending on configuration)
Table handlers ..................
78 --> 0xF98EFFD0 (sip_state)
8158 --> 0xF9872070 (connections)
Global handlers ................. none
The command output includes the following parameters:
During FCU: This should be "yes" only after running the fw fcu command and before running cphastop on the final OM. In all other cases it should be "no".
Number of connection modules: Safe to ignore.
Connection module map: The output reveals a translation map from the OM to the NM. For additional information, refer to Full Connectivity Upgrade Limitations.
Table id map: This shows the mapping between the gateway's kernel table indices on the OM and on the NM. Having a translation is not mandatory.
Table handlers: This should include a sip_state and connection table handlers. In a security gateway configuration, a VPN handler should also be included.
Global handlers: Reserved for future use.
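When the output is captured to a file, the two fields that matter most for a quick check (the During FCU flag and the connection module map) can be pulled out programmatically. The following Python sketch is illustrative only. It assumes that the output has the same layout as the typical output shown above, and the function name parse_fcustat is hypothetical.

```python
import re


def parse_fcustat(output):
    """Pull the 'During FCU' flag and the connection module map out of the text.

    Returns {"during_fcu": True/False/None, "module_map": {name: (remote, local)}}.
    Table handler lines are skipped because their target is a hex address.
    """
    result = {"during_fcu": None, "module_map": {}}
    for line in output.splitlines():
        flag = re.match(r"^\s*During FCU\.+\s*(yes|no)\s*$", line)
        if flag:
            result["during_fcu"] = flag.group(1) == "yes"
            continue
        mapping = re.match(r"^\s*(\d+)\s*-->\s*(\d+)\s*\((\w+)\)\s*$", line)
        if mapping:
            remote, local, name = mapping.groups()
            result["module_map"][name] = (int(remote), int(local))
    return result


# Excerpt of the typical output shown above.
SAMPLE = """During FCU....................... yes
Number of connection modules..... 23
Connection module map (remote -->local)
0 --> 0 (Accounting)
1 --> 1 (Authentication)
2 --> 3 (NAT)
3 --> 4 (SeqVerifier)
4 --> 5 (SynDefender)
5 --> 6 (Tcpstreaming)
6 --> 7 (VPN)
Table handlers ..................
78 --> 0xF98EFFD0 (sip_state)
8158 --> 0xF9872070 (connections)
Global handlers ................. none"""

parsed = parse_fcustat(SAMPLE)
print("During FCU:", parsed["during_fcu"])
for name, (remote, local) in parsed["module_map"].items():
    print(f"{name}: old member index {remote}, new member index {local}")
```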
Display the Connections Table
This command displays the "connections" table. If everything was synchronized correctly, the number of entries in this table and the content itself should be approximately the same on the old and new cluster members. This is an approximation because, between the times that you run the command on the old and new members, new connections may have been created or old connections deleted.
Note - Not all connections are synchronized. For example, local connections and services marked as non-synchronized are not synchronized.
Syntax:
fw tab -t connections -u [-s]
Options:
-t - table
-u - unlimited entries
-s - (optional) summary of the number of connections
For more on the fw tab -t connections command, see the Command Line Interface Guide.
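Because the two counts are only expected to be approximately equal, a comparison needs a tolerance. The following Python sketch is illustrative only. It assumes that you have read the number of connections on each member from the summary output yourself. The 5% default tolerance and the function name counts_roughly_equal are assumptions, not values from Check Point documentation.

```python
def counts_roughly_equal(old_count, new_count, tolerance=0.05):
    """True when the two counts differ by at most `tolerance` of the larger one."""
    if old_count < 0 or new_count < 0:
        raise ValueError("connection counts cannot be negative")
    larger = max(old_count, new_count)
    if larger == 0:
        # Both tables are empty, so they trivially match.
        return True
    return abs(old_count - new_count) <= tolerance * larger


# Counts noted by hand on the old and new members (sample data).
print(counts_roughly_equal(1000, 980))
print(counts_roughly_equal(1000, 800))
```

Choose the tolerance according to how quickly connections open and close in your environment and how far apart in time the two commands were run.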
You can run the fw fcu command more than once. Be sure to run both cpstop and cpstart on the NM before re-running the fw fcu command. The table handlers that deal with the upgrade are only created during policy installation, and cpstart installs policy.