Troubleshooting Kubernetes Onboarding

Deploying the Agent

  1. To see if the pod exists, run:

    kubectl -n <namespace> get pods

    If the pod exists, but is not ready, get the pod's details:

    kubectl describe -n <namespace> pod <pod_name>

  2. Review the Kubernetes logs for errors. To get the logs, run:

    kubectl -n <namespace> logs <pod_name> [-c <container_name>]

  3. Make sure the pods have connectivity to CloudGuard.

    The pod must have HTTPS (port 443) connectivity to https://api-cpx.dome9.com.

    Check these entities for configuration issues that can prevent the agent from connecting:

    • Proxy

    • Network Policy

    • Security Groups

    • Firewall rules

    To test connectivity, you can deploy a different pod that has curl (for example, an Alpine pod) in the same namespace and with the same labels, exec into it, and use curl to check connectivity to the URL above:

    apk update ; apk add curl ; curl -k https://api-cpx.dome9.com/namespaces -X POST

    If there is connectivity with the CloudGuard backend, the response is {"message":"Unauthorized"}.

  4. Make sure the nodes can connect to the Image Registry.

    The node needs HTTPS (port 443) connectivity to the Quay registry. If you see an image pull error, make a connectivity check to:

    https://quay.io/checkpoint/
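
The checks in steps 1 and 3 can be scripted. The sketch below is only an illustration and is not part of the CloudGuard tooling: the function names not_ready_pods and backend_reachable are hypothetical, and the first function assumes the default column layout of kubectl get pods (NAME first, then READY as ready/total).

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative and not part of CloudGuard.

# Reads `kubectl get pods --no-headers` output on stdin and prints the
# names of pods whose READY column (for example 0/1) shows fewer ready
# containers than the total.
not_ready_pods() {
  awk '{ split($2, r, "/"); if (r[1] != r[2]) print $1 }'
}

# Reads the response body of the curl check on stdin. Succeeds only if
# the body contains the reply that indicates connectivity to the
# CloudGuard backend.
backend_reachable() {
  grep -Fq '{"message":"Unauthorized"}'
}
```

For example, pipe the output of kubectl -n <namespace> get pods --no-headers into not_ready_pods, and pipe the output of the curl command from step 3 into backend_reachable. Pods that completed normally (for example, pods of finished Jobs) also show a READY value such as 0/1, so read the result together with the STATUS column.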

Cluster Behind a Gateway

If the traffic passes from the cluster to the Internet through a Security Gateway with HTTPS inspection, you have to configure a customer CA (Certificate Authority) certificate for the agents.

  1. Put the customer Base64 PEM-encoded CA certificate in a configmap in the applicable namespace.

    For example, if the CA certificate is in file ca.cer:

    kubectl -n <namespace> create configmap ca-store --from-file=ca.cer=<PATH_TO_CA_CERTIFICATE_FILE>

  2. Install the file on the containers at /etc/ssl/cert.pem.

    1. To add a volume, edit the applicable workload:

      - name: ca-volume
        configMap:
          name: ca-store
    2. Add the mount under volumeMounts for the applicable container in the workload:

      - name: ca-volume
        mountPath: /etc/ssl/cert.pem
        subPath: ca.cer
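
Because the configmap must hold a PEM-encoded certificate, it can help to check the file before you create the configmap. The sketch below is a minimal check under one assumption: a PEM certificate file contains the standard -----BEGIN CERTIFICATE----- header. The function name is_pem_cert is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper; not part of CloudGuard.
# Succeeds if the given file contains a PEM certificate header.
# A binary (DER-encoded) certificate file fails this check.
is_pem_cert() {
  grep -q -e '-----BEGIN CERTIFICATE-----' "$1"
}
```

For example, run is_pem_cert ca.cer and continue with kubectl create configmap only if the command succeeds. The check looks at the header only and does not validate the certificate itself.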

Blocked or Unreported Clusters

You must onboard each cluster with its own Environment ID. If several clusters use the same Environment ID at the same time, they are blocked and do not report. In this case, the agent's logs show an error such as:

[error] api-cpx.dome9.com:443, HTTP status=403

cloudAccount marked as blocked

An equivalent error shows on the Environment page of the CloudGuard portal.
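
To look for this condition in the logs of a pod, you can filter the log text for the message above. This is a minimal sketch; the function name is_cluster_blocked is hypothetical, and it matches only the exact message text shown in this section.

```shell
#!/bin/sh
# Hypothetical helper; not part of CloudGuard.
# Reads agent log text on stdin and succeeds if it contains the
# "blocked" message shown in this section.
is_cluster_blocked() {
  grep -Fq 'cloudAccount marked as blocked'
}
```

For example: kubectl -n <namespace> logs <pod_name> | is_cluster_blocked && echo "cluster is blocked"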

To correct the issue:

  1. Offboard CloudGuard agents that share the same ID from the unnecessary cluster with this command:

    helm uninstall asset-mgmt --namespace <namespace>

  2. After you offboard the unnecessary clusters, use the API request to correct multiple onboardings and make sure the issue is resolved.

Important - The API request requires special CloudGuard privileges. The credentials (username and password) in this request must belong to a CloudGuard Service Account with the Manage Resources permission. These are different from the username and password used when you configured the agents, which only allow the agents to report data to the backend. To configure the privileges, follow the steps below.

To configure CloudGuard privileges for Manage Resources:

  1. In the CloudGuard menu, navigate to Settings > Service Accounts and select Add Account.

  2. In Selected Roles, select a role with the Manage Resources permissions or create a new role with these permissions.

    For more information about roles and service accounts, see Roles.

Installation of Agents Fails in Clusters with OPA Gatekeeper

When OPA (Open Policy Agent) Gatekeeper is configured in a cluster with custom block policies, the installation of the CloudGuard agents can fail, for example, because of the required permissions. For a successful installation, you must exclude the CloudGuard agents from OPA Gatekeeper enforcement.

For this, add an exclusion of the CloudGuard agents namespace to the Gatekeeper configuration as described in the Gatekeeper documentation.

  • If no Gatekeeper config exists, create one with the content below:

    apiVersion: config.gatekeeper.sh/v1alpha1
    kind: Config
    metadata:
      name: config
      namespace: GATEKEEPER-NAMESPACE
    spec:
      match:
        - excludedNamespaces: ["CLOUDGUARD-NAMESPACE"]
          processes: ["*"]
  • If a Gatekeeper config already exists, add the entry below to its spec.match list:

        - excludedNamespaces: ["CLOUDGUARD-NAMESPACE"]
          processes: ["*"]

Notes:

  • Change GATEKEEPER-NAMESPACE to the Gatekeeper installation namespace.

  • Change CLOUDGUARD-NAMESPACE to the CloudGuard installation namespace.
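
Before you edit an existing config, you may want to check whether the CloudGuard namespace is already excluded. The sketch below assumes that the excluded namespaces are passed as a whitespace-separated list, for example from kubectl with a jsonpath output. The function name has_exclusion is hypothetical, and the function does an exact match only, so it does not evaluate wildcard entries.

```shell
#!/bin/sh
# Hypothetical helper; not part of CloudGuard or Gatekeeper.
# Reads a whitespace-separated list of namespaces on stdin and succeeds
# if the namespace given as the first argument is in the list.
has_exclusion() {
  tr -s '[:space:]' '\n' | grep -Fxq -e "$1"
}
```

For example: kubectl -n GATEKEEPER-NAMESPACE get config.config.gatekeeper.sh config -o jsonpath='{.spec.match[*].excludedNamespaces[*]}' | has_exclusion CLOUDGUARD-NAMESPACE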

How to Enable Debugging

  1. Edit the deployment, and set the debug level:

    kubectl -n <namespace> set env deployment <deployment> LOG_LEVEL=debug

  2. Review the logs and confirm that they now include debug-level messages.
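
The steps above can be wrapped in two small functions so that debug logging is easy to turn off again. This is a sketch, not an official procedure: the function names are hypothetical, and disable_debug removes the LOG_LEVEL variable with the kubectl set env KEY- syntax, which assumes the deployment did not define LOG_LEVEL before you enabled debugging.

```shell
#!/bin/sh
# Hypothetical helpers; not part of CloudGuard.
# Usage: enable_debug <namespace> <deployment>
enable_debug() {
  kubectl -n "$1" set env deployment "$2" LOG_LEVEL=debug
}

# Usage: disable_debug <namespace> <deployment>
# The trailing dash tells kubectl to remove the variable.
disable_debug() {
  kubectl -n "$1" set env deployment "$2" LOG_LEVEL-
}
```

Each change to the environment of a deployment starts a rollout, so the pods restart with the new setting.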

How to Collect CloudGuard Container Release Information

To collect this information for further troubleshooting, download the cloudguard-container-info-collect-v2.sh shell script from the Download Center. Before you run the script, make it executable:

chmod +x cloudguard-container-info-collect-v2.sh

Script prerequisites:

  • The user must have kubectl and helm installed on the server that runs this script and kubeconfig context set to the related cluster.

  • The user must have the correct permissions to run the helm and kubectl commands for the applicable cluster.

  • These common Linux commands must be available: rm, tar, and mkdir.
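
A quick way to check the tool prerequisites is to test that each command is on the PATH. The sketch below uses the standard command -v lookup; the function name missing_tools is hypothetical. It checks only that the tools are installed, not the kubeconfig context or the permissions.

```shell
#!/bin/sh
# Hypothetical helper; not part of the collection script.
# Prints each tool name from the arguments that is not found on the PATH.
# Prints nothing if all tools are available.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}
```

For example, missing_tools kubectl helm rm tar mkdir prints nothing when all prerequisites are installed.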

Default assumptions:

The script collects CloudGuard CRDs (Custom Resource Definitions).

Syntax:

./cloudguard-container-info-collect-v2.sh [-h | -c | -d | -m | -n | -o | -r]

Parameters:

Parameter                                 Description

{-h | --help}                             Shows the built-in help.

{-c | --crd} yes                          Specifies if the script has to collect CloudGuard CRDs.
{-c | --crd} no                           Default value: yes.

{-d | --debug}                            Runs the script in debug mode.

{-m | --metrics}                          Specifies if the script has to collect metrics for the CloudGuard agent containers.
                                          If metrics collection is enabled, 'kubectl exec' runs on the fluentbit containers.
                                          Default: disabled.

{-n | --namespace} <namespace>            Specifies the namespace.

-o <Name of Output File>                  Specifies the custom name for the output TAR archive.

{-r | --release} <Name of Helm Release>   Specifies the Helm release name.

More Links

API Reference Guide

Gatekeeper documentation