Connection Draining in CloudGuard Network for Azure VMSS

Connection draining (also referred to as "instance draining" or "graceful termination") is a mechanism used to gracefully remove compute instances from service while maintaining application availability.

It ensures that when a virtual machine (VM) is removed from a Scale Set due to rolling updates or health check failures, existing connections are allowed to complete before the instance is terminated.

The Azure Standard Load Balancer has no built-in connection draining feature. Instead, the Load Balancer stops directing new connections to a backend instance once that instance stops responding to health probes.

However, existing TCP connections to that instance remain active until they are closed by the application or until the configured idle timeout on the corresponding inbound or outbound rule is reached.

Connection Draining Overview

Connection draining is a key capability for enabling graceful shutdown or decommissioning of compute instances, such as VMSS instances behind a Load Balancer. It allows in-flight requests to complete before the instance is removed from service, minimizing service disruption during maintenance events or manual interventions.

A connection draining mechanism typically involves:

Preventing new incoming connections to the instance.
Allowing existing connections to continue for a defined period.
Gracefully removing the instance from the backend pool after ongoing connections are completed.

Connection Draining Use Cases

Application upgrades
Connection draining allows in-flight requests to complete before shutting down a VM.

Rolling deployments
Connection draining helps avoid dropped connections during version transitions.

Maintenance events
Connection draining provides graceful handling of VMs during host updates or reboots.

Connection Draining During Instance Reboot

Before you reboot a CloudGuard Network instance in Azure Virtual Machine Scale Sets (VMSS), make sure that new connections are no longer forwarded to the instance and that existing connections are allowed to complete gracefully.

To achieve this, configure the CloudGuard Network VMSS instance to stop responding to health probe requests from the Azure Load Balancer. To do so, disable the health probe port (8117 by default) with the kernel variable cloud_balancer_port.

When the CloudGuard Network VMSS instance stops responding to health probes, the Load Balancer marks the instance as unhealthy and stops routing new connections to it.

Existing TCP connections remain active and continue to be processed by the CloudGuard Network Security Gateway until they reach the configured idle timeout defined in the Load Balancer rule.

This allows in-flight sessions to complete without interruption, ensuring a degree of connection draining even in the absence of native support for this feature.

The idle timeout value on the Azure Load Balancer can be configured between 4 and 100 minutes, and it defines how long an inactive connection is maintained before being closed.

Load Balancer Rule Type	Default TCP Idle Timeout	Configurable Range
Standard LB Outbound rule	4 minutes	4 to 100 minutes
Standard LB Inbound rule	4 minutes	4 to 100 minutes
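
As a rough illustration, the idle timeout on an inbound rule could be changed with the Azure CLI from an administrator workstation. The sketch below only validates the value against the 4 to 100 minute range stated above and prints an `az network lb rule update` command instead of running it. The helper name `build_idle_timeout_cmd` is hypothetical, and the resource group, Load Balancer, and rule names are placeholders to be replaced with your own.

```shell
#!/usr/bin/env bash
# Hypothetical helper: validate an idle timeout (in minutes) and print the
# Azure CLI command that would apply it. Nothing is sent to Azure.
build_idle_timeout_cmd() {
    local rg="$1" lb="$2" rule="$3" minutes="$4"

    # Accept whole numbers only.
    case "$minutes" in
        ''|*[!0-9]*)
            echo "idle timeout must be a whole number of minutes" >&2
            return 1
            ;;
    esac

    # Range stated in this section: 4 to 100 minutes.
    if [ "$minutes" -lt 4 ] || [ "$minutes" -gt 100 ]; then
        echo "idle timeout must be between 4 and 100 minutes" >&2
        return 1
    fi

    printf 'az network lb rule update --resource-group %s --lb-name %s --name %s --idle-timeout %s\n' \
        "$rg" "$lb" "$rule" "$minutes"
}
```

For example, `build_idle_timeout_cmd my-rg my-lb my-rule 30` prints the command for a 30-minute timeout, which you can review before running it. A longer timeout gives idle sessions more time to finish during draining, at the cost of a longer wait before maintenance.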

Follow these steps for a graceful reboot:

Connect via SSH to the CloudGuard Network VMSS instance that needs maintenance or reboot.
Switch to Expert mode.
To view the current number of active connections, run:

fw tab -t connections -s

Check the current Load Balancer health probe port number with this command:

fw ctl get int cloud_balancer_port

The default port number is 8117.
Disable the CloudGuard Network VMSS instance responses to Load Balancer health probes. This stops the Load Balancer from sending new connections to the instance. To do so, run:

fw ctl set int cloud_balancer_port 0

Monitor traffic drain with this command:

fw tab -t connections -s

Wait until the number of active connections decreases significantly.

Note - System connections usually persist, so the value never reaches 0.

When connections have been reduced, proceed with reboot or other planned maintenance on the instance.
To return the instance to service after maintenance, re-enable the health probing with this command:

fw ctl set int cloud_balancer_port 8117

Note - After the instance is rebooted, the health probe port automatically starts responding again. To prevent the automatic probe recovery, set cloud_balancer_port=0 in the kernel parameters configuration file (kern.conf).

Refer to sk26202 for instructions on modifying the kernel configuration file.
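
The monitoring step in the procedure above can be scripted. The following is a minimal sketch, not a Check Point tool: it polls the connections table summary and stops once the count drops to a chosen threshold. It assumes the summary output has the columns HOST, NAME, ID, #VALS, #PEAK, #SLINKS, so that the fourth field of the "connections" row is the current count. The function names, the `CONN_SUMMARY_CMD` variable, and the threshold are hypothetical.

```shell
#!/usr/bin/env bash
# Command that prints the connections table summary. It can be overridden,
# so the logic can be tried on a machine without the fw CLI.
CONN_SUMMARY_CMD="${CONN_SUMMARY_CMD:-fw tab -t connections -s}"

# Print the #VALS column of the "connections" row.
# Assumed layout: HOST NAME ID #VALS #PEAK #SLINKS.
current_connections() {
    $CONN_SUMMARY_CMD | awk '$2 == "connections" { print $4; exit }'
}

# Hypothetical helper: poll until the count is at or below the threshold.
# Returns 0 when drained, 1 if max_polls is exhausted, 2 on a parse failure.
wait_for_drain() {
    local threshold="$1" interval="$2" max_polls="$3"
    local count i=1

    while [ "$i" -le "$max_polls" ]; do
        count="$(current_connections)"
        if [ -z "$count" ]; then
            echo "could not read the connections count" >&2
            return 2
        fi
        echo "active connections: $count"
        if [ "$count" -le "$threshold" ]; then
            return 0
        fi
        # Sleep only between polls, not after the last one.
        if [ "$i" -lt "$max_polls" ]; then
            sleep "$interval"
        fi
        i=$((i + 1))
    done
    return 1
}
```

In Expert mode, after setting cloud_balancer_port to 0, a call such as `wait_for_drain 50 30 20` would check every 30 seconds, up to 20 times, until at most 50 connections remain. Because system connections persist, choose a threshold above 0 that matches the idle baseline you observe on your own gateways.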

Connection Draining During Instance Termination

When a CloudGuard Network Security Gateway instance in Azure Virtual Machine Scale Sets (VMSS) terminates abruptly, it is immediately removed from service. As a result:

The Azure Load Balancer does not have the opportunity to mark the CloudGuard Network VMSS instance as unhealthy.
New connections are no longer routed to the CloudGuard Network VMSS instance after it is removed from the backend pool.
All existing TCP/UDP connections are dropped instantly, regardless of their state or how long they have been idle.
No grace period is provided for in-flight connections to complete.

Graceful Termination

To minimize service disruption, we recommend that you drain connections from the CloudGuard Network VMSS instance before you terminate it. To do so:

Manually disable the health probe responses on the CloudGuard Network VMSS instance.
Wait for connections to drain based on the idle timeout.
Only proceed with termination when the number of active connections is minimal.

Follow these steps:

Connect to the CloudGuard Network VMSS instance via SSH.
Switch to Expert mode.
To view the current number of active connections, run:

fw tab -t connections -s

Check the current Load Balancer health probe port number with this command:

fw ctl get int cloud_balancer_port

The default port number is 8117.
Disable the CloudGuard Network VMSS instance responses to Load Balancer health probes. This stops the Load Balancer from sending new connections to the instance. To do so, run:

fw ctl set int cloud_balancer_port 0

Monitor traffic drain with this command:

fw tab -t connections -s

Wait until the number of active connections decreases to a minimum.

Note - System connections usually persist, so the value will never be 0.

In the Azure portal, go to VMSS > Instances and select the CloudGuard Network VMSS instance you want to terminate. Click Delete.
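
If you prefer the Azure CLI to the portal for the final step, the deletion can be gated on the drained connection count. The sketch below is an illustration under assumptions, not part of the product: it prints an `az vmss delete-instances` command only when the count you observed on the gateway is at or below a threshold you choose. The helper name `build_delete_cmd` is hypothetical, the resource names are placeholders, and the command is meant to be run from an administrator workstation, not on the gateway.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the Azure CLI command that deletes one VMSS
# instance, but only if the observed connection count is low enough.
# Nothing is sent to Azure; review the printed command before running it.
build_delete_cmd() {
    local rg="$1" vmss="$2" instance_id="$3" active="$4" threshold="$5"

    if [ "$active" -gt "$threshold" ]; then
        echo "refusing: $active connections still active (threshold $threshold)" >&2
        return 1
    fi

    printf 'az vmss delete-instances --resource-group %s --name %s --instance-ids %s\n' \
        "$rg" "$vmss" "$instance_id"
}
```

For example, `build_delete_cmd my-rg my-vmss 3 12 20` prints the delete command for instance 3, because 12 active connections is below the threshold of 20. With a count of 120 the helper prints a refusal and returns a non-zero status.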