Troubleshooting the Contrast Agent Operator

Objective

This article suggests steps for troubleshooting issues with the Contrast Agent Operator.

Application not Onboarded to the Contrast UI

If all of the documented steps for installing and configuring the Contrast Agent Operator have been followed but the application has not been onboarded, there are several things to check for.

The first thing to rule out is that the operator has been incorrectly installed or configured.  The following steps should help with that:

Are all the necessary resources deployed and in the correct namespaces?

Run the following command:

kubectl get secrets,clusteragentconnections,agentinjectors --all-namespaces

and verify that the output includes all three of the requested resource types.

These three resources are required at a minimum for the operator to function correctly:

  • The Secret exists and is deployed to the contrast-agent-operator namespace.
  • The ClusterAgentConnection exists and is deployed to the contrast-agent-operator namespace.
  • The AgentInjector exists and is deployed to the same namespace as the application to be instrumented.
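
The checks in the list above can also be scripted. The following is a minimal sketch rather than an official Contrast tool: check_resource is a hypothetical helper name, and the default namespace for the AgentInjector is an assumption taken from the examples in this article. The Secret is left out because its name depends on your configuration.

```shell
# Hypothetical helper: report whether at least one resource of the given
# kind exists in the given namespace. Uses only `kubectl get`.
check_resource() {
  kind="$1"
  namespace="$2"
  # With --no-headers, stdout is empty when no resources are found.
  if kubectl get "$kind" --namespace "$namespace" --no-headers 2>/dev/null | grep -q .; then
    echo "ok: $kind found in $namespace"
  else
    echo "missing: $kind in $namespace"
  fi
}

# Example usage (assumes the application runs in the "default" namespace):
#   check_resource clusteragentconnections contrast-agent-operator
#   check_resource agentinjectors default
```

A "missing" result for either resource points back to the installation and configuration steps.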

Is the application deployment correctly tagged?

The AgentInjector should contain a label definition under spec.selector.labels that indicates to the Agent Operator which deployments should be injected with the agent.

For example, this kubectl command will display the AgentInjector manifest:

kubectl get agentinjector webgoatdotnetcore -o yaml --namespace default

In the output, note the label definition under spec.selector.labels.  In this example, it tells the operator that any deployment labeled contrast-agent=dotnet-core should be injected.  The name and value here are arbitrary and can be anything you choose.  Glob patterns are also supported.

Now, verify that the application deployment specifies the same label. Labels can be defined in several places in a deployment manifest, but the one that matters here is metadata.labels.  You can see the corresponding label in the deployment manifest:

kubectl get deployment webgoatdotnetcore -o yaml

The following kubectl command will return the corresponding label unambiguously:

kubectl get deployments --show-labels
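
To confirm the match from the deployment side, you can ask Kubernetes for only those deployments that carry the injector's label. This is a minimal sketch, assuming the contrast-agent=dotnet-core label and the default namespace from the example above; deployments_with_label is a hypothetical helper name. The -l flag performs an exact match, so it does not emulate glob patterns used in an injector.

```shell
# Hypothetical helper: list deployments whose metadata.labels contain
# the given label (exact match only, no glob support).
deployments_with_label() {
  label="$1"                  # e.g. contrast-agent=dotnet-core
  namespace="${2:-default}"   # assumption: application lives in "default"
  kubectl get deployments --namespace "$namespace" -l "$label" -o name
}

# Example usage:
#   deployments_with_label contrast-agent=dotnet-core default
```

An empty result suggests that the label is missing from metadata.labels or is spelled differently from the injector's selector.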

Note that, in general, you can review any of the resources currently deployed to K8s, using this command structure:

kubectl get [resource type] [resource name] -o yaml --namespace [namespace]

For example:

kubectl get clusteragentconnection default-agent-connection -o yaml --namespace contrast-agent-operator

Logs and Metrics

The Agent Operator logs

The Agent Operator runs as a pod in the contrast-agent-operator namespace. You can view or tail logs from this pod to look for problems with agent injectors, configurations, and connections.

This kubectl command will display the logs for the deployment:

kubectl logs -f deployment/contrast-agent-operator --namespace contrast-agent-operator

Here's an example showing the operator startup followed by checking available pods for patching and then a successful injection on a pod:

[2023-06-23 19:48:36.0774 INFO Program] Starting the Contrast Security Agent Operator 1.0.0.0.
[2023-06-23 19:48:37.1566 INFO OptionsLogger] Option 'install-source' was changed from 'unknown' (default) -> 'kustomize'.
[2023-06-23 19:48:37.9095 INFO ApplicationStartup] Registered mutation webhook "contrast.k8s.agentoperator.controllers.v1pod.podmutationwebhook" under "/v1/pods/podmutationwebhook/mutate".
[2023-06-23 19:48:38.2095 DEBUG BaseApplier`2:PodResource] Resource 'PodResource/contrast-agent-operator/contrast-agent-operator-5545877df8-8kjg7' was reconciled.
[2023-06-23 19:48:38.2150 INFO MergingStateProvider] Merging state modified events until '06/23/2023 19:48:48 +00:00'.
[2023-06-23 19:48:38.2222 INFO MatchInjectorsHandler] Reactions are disabled, cluster state is settling or instance is not leading.
[2023-06-23 19:48:38.2398 DEBUG BaseApplier`2:PodResource] Resource 'PodResource/contrast-agent-operator/contrast-agent-operator-6547f5c6d8-x2qxg' was reconciled.
[2023-06-23 19:48:38.2398 DEBUG BaseApplier`2:PodResource] Resource 'PodResource/default/webgoatdotnetcore-6f745d4b6b-7ndsj' was reconciled.
[2023-06-23 19:48:38.2398 DEBUG BaseApplier`2:PodResource] Resource 'PodResource/kube-system/aws-node-2cglk' was reconciled.
[2023-06-23 19:48:38.2398 DEBUG BaseApplier`2:PodResource] Resource 'PodResource/kube-system/aws-node-s8zlp' was reconciled.
[2023-06-23 19:48:38.2398 DEBUG BaseApplier`2:PodResource] Resource 'PodResource/kube-system/coredns-5c5677bc78-m7krb' was reconciled.
[2023-06-23 19:48:38.2398 DEBUG BaseApplier`2:PodResource] Resource 'PodResource/kube-system/coredns-5c5677bc78-szqr7' was reconciled.
[2023-06-23 19:48:38.2398 DEBUG BaseApplier`2:PodResource] Resource 'PodResource/kube-system/kube-proxy-79zl5' was reconciled.
[2023-06-23 19:48:38.2398 DEBUG BaseApplier`2:PodResource] Resource 'PodResource/kube-system/kube-proxy-7zq7k' was reconciled.
[2023-06-23 19:48:38.2575 TRACE ClusterIdHandler] Internal cluster id was updated. (Generated: 2023-02-23T15:49:04.6346830+00:00)
[2023-06-23 19:48:38.2635 DEBUG BaseApplier`2:SecretResource] Resource 'SecretResource/contrast-agent-operator/contrast-cluster-id' was reconciled.
[2023-06-23 19:48:38.2651 DEBUG BaseApplier`2:ClusterAgentConnectionResource] Resource 'ClusterAgentConnectionResource/contrast-agent-operator/default-agent-connection' was reconciled.
.....
.....
[2023-06-23 19:50:26.4766 TRACE BaseSyncingHandler`3:ClusterAgentConnectionResource] Checking for cluster 'AgentConnectionSecret' eligible for generation across 1 templates in 3 namespaces.
[2023-06-23 19:50:26.4766 TRACE BaseSyncingHandler`3:ClusterAgentConnectionResource] Completed checking for entity generation after 5ms.
[2023-06-23 19:50:26.4766 TRACE BaseSyncingHandler`3:ClusterAgentConnectionResource] Checking for cluster 'AgentConnection' eligible for generation across 1 templates in 3 namespaces.
[2023-06-23 19:50:26.4766 TRACE BaseSyncingHandler`3:ClusterAgentConnectionResource] Completed checking for entity generation after 0ms.
[2023-06-23 19:50:26.5361 DEBUG PodMutationWebhook] Admission with method "CREATE".
[2023-06-23 19:50:26.5451 TRACE PodPatcher] Selected agent injector 'DotNetCore'.
[2023-06-23 19:50:26.5543 TRACE GlobMatcher] Compiling glob pattern '*'.
[2023-06-23 19:50:26.5596 INFO PodInjectionHandler] Patching pod from 'default/webgoatdotnetcore' using injector 'default/webgoatdotnetcore'.
[2023-06-23 19:50:26.6298 DEBUG PodMutationWebhook] AdmissionHook "contrast.k8s.agentoperator.controllers.v1pod.podmutationwebhook" did return "True" for "CREATE".
[2023-06-23 19:50:26.6585 DEBUG PodMutationWebhook] Admission with method "CREATE".
[2023-06-23 19:50:26.6585 TRACE PodPatcher] Selected agent injector 'DotNetCore'.
[2023-06-23 19:50:26.6585 INFO PodInjectionHandler] Patching pod from 'default/webgoatdotnetcore' using injector 'default/webgoatdotnetcore'.
[2023-06-23 19:50:26.6643 DEBUG PodMutationWebhook] AdmissionHook "contrast.k8s.agentoperator.controllers.v1pod.podmutationwebhook" did return "True" for "CREATE".
[2023-06-23 19:50:26.6800 DEBUG BaseApplier`2:PodResource] Resource 'PodResource/default/webgoatdotnetcore-569f49b79d-swzmn' was reconciled.
[2023-06-23 19:50:35.9086 DEBUG BaseApplier`2:PodResource] Resource 'default/webgoatdotnetcore-6f745d4b6b-7ndsj' of type 'PodResource' was deleted.
[2023-06-23 19:50:36.5137 TRACE MergingStateProvider] Flushing state modified, 2 events were merged.
[2023-06-23 19:50:36.5137 INFO MatchInjectorsHandler] Cluster state changed, re-calculating injection points (2 changes merged).
[2023-06-23 19:50:36.5137 TRACE MatchInjectorsHandler] Calculating changes needed for 'DeploymentResource/contrast-agent-operator/contrast-agent-operator'...
[2023-06-23 19:50:36.5137 TRACE MatchInjectorsHandler] Calculating changes needed for 'DeploymentResource/default/webgoatdotnetcore'...
[2023-06-23 19:50:36.5137 INFO PodTemplateStatusHandler] Pod 'default/webgoatdotnetcore-569f49b79d-swzmn' status was updated 'None' -> 'InjectionComplete'.
[2023-06-23 19:50:36.5477 TRACE ResourcePatcher] Preparing to patch status 'default/webgoatdotnetcore-569f49b79d-swzmn' ('Pod/v1') with '{"lastTransitionTime":"2023-06-23T19:50:36.516960\u002B00:00","message":"The pod is eligible for agent injection and is currently injected.","reason":"InjectionComplete","status":"True","type":"agents.contrastsecurity.com/injection-converged"}'.
[2023-06-23 19:50:36.6026 TRACE ResourcePatcher] Patch complete after 67ms.

If the operator logs indicate that the status is still InjectionPending, this article should help in tracking down the issue: Contrast-agent-operator stuck in InjectionPending.
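
The ResourcePatcher line in the log above shows the operator patching a condition of type agents.contrastsecurity.com/injection-converged onto the pod status. Assuming that condition is stored under status.conditions on the pod (an inference from that log line, not something confirmed here), a sketch like the following would read its reason directly from the pod. injection_reason is a hypothetical helper name.

```shell
# Hypothetical helper: print the reason of the injection-converged
# condition for a pod, e.g. InjectionComplete or InjectionPending.
# Assumption: the condition lives in .status.conditions of the pod.
injection_reason() {
  pod="$1"
  namespace="${2:-default}"
  kubectl get pod "$pod" --namespace "$namespace" \
    -o jsonpath='{.status.conditions[?(@.type=="agents.contrastsecurity.com/injection-converged")].reason}'
}

# Example usage:
#   injection_reason webgoatdotnetcore-569f49b79d-swzmn default
```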

Enabling more verbose logging for the Agent Operator

The log snippet above shows output at the TRACE logging level.  The default is INFO. See How to get logs from the Agent Operator for detail on configuring more detailed operator logging.
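
TRACE output is verbose, so it can help to narrow the stream to the lines that describe injection. The following is a minimal sketch: injection_events is a hypothetical helper, and the component names it matches are simply those that appear in the sample log in this article.

```shell
# Hypothetical filter: keep only the log lines that report pod patching
# or injection status changes. Reads operator log lines on stdin.
injection_events() {
  grep -E 'PodInjectionHandler|PodTemplateStatusHandler|InjectionPending'
}

# Example usage:
#   kubectl logs deployment/contrast-agent-operator \
#     --namespace contrast-agent-operator | injection_events
```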

Getting cluster event logs

These may provide additional insight into problems when injectors are not working.

kubectl get events
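
On a busy cluster the event list can be long. The sketch below applies two common refinements using standard kubectl flags: restricting the list to warnings and sorting by creation time. The default namespace is an assumption, and recent_warnings is a hypothetical helper name.

```shell
# Hypothetical helper: show only Warning events in a namespace,
# sorted by creation time.
recent_warnings() {
  namespace="${1:-default}"
  kubectl get events --namespace "$namespace" \
    --field-selector type=Warning \
    --sort-by=.metadata.creationTimestamp
}

# Example usage:
#   recent_warnings default
#   recent_warnings contrast-agent-operator
```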

The application deployment logs

You can utilize the regular STDOUT on pods to get an indication of whether the agent was successfully injected.

For example - to show the logs for a given deployment:

kubectl logs -f deployment/webgoatdotnetcore

Alternatively, use the pod name. First fetch the name, then request the logs for that pod:

kubectl get pods

kubectl logs pods/webgoatdotnetcore-569f49b79d-swzmn

An injected pod has two containers, one of which is the contrast-init container.  You can view its logs like so:

kubectl logs pods/webgoatdotnetcore-569f49b79d-swzmn -c contrast-init
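
If you are unsure whether a pod was injected at all, you can check its spec for the contrast-init container before asking for its logs. This is a minimal sketch; has_contrast_init is a hypothetical helper name, and it looks in both initContainers and containers because this article does not state which list the container appears in.

```shell
# Hypothetical helper: succeed (exit status 0) if the pod spec lists a
# container named contrast-init, fail otherwise.
has_contrast_init() {
  pod="$1"
  namespace="${2:-default}"
  kubectl get pod "$pod" --namespace "$namespace" \
    -o jsonpath='{.spec.initContainers[*].name} {.spec.containers[*].name}' \
    | tr ' ' '\n' | grep -qx 'contrast-init'
}

# Example usage:
#   if has_contrast_init webgoatdotnetcore-569f49b79d-swzmn default; then
#     echo "contrast-init present"
#   else
#     echo "contrast-init not found; the pod was probably not injected"
#   fi
```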

Control plane logging

These logs may be more difficult to obtain, and getting them requires a cluster administrator. They cover the cluster as a whole: the Kubernetes API server, controllers, schedulers and audit logs.

If access is available, look for recent API server and controller manager logs.

Some things to look for:

  • Signs that Kubernetes is unable to contact the operator.
    • This could be due to non-standard security policies or configurations.
  • Webhook-related errors. Focus on these first.
  • If there are no webhook-related errors, the issue could, again, be related to security policies or configurations.

Agent Logs

If all of the above checks out and it appears that the agent is being injected successfully, but the application is still not showing up in the Contrast UI, the next place to look would be the agent logs themselves.  Connect a terminal to the running pod - for example:

kubectl exec --stdin --tty [pod name] -- /bin/sh

The agent logs can then be found in /contrast/data/logs.

Performance Metrics

The Agent Operator generates metrics that can be accessed using an API endpoint as follows:

kubectl logs -f deployment/contrast-agent-operator --namespace contrast-agent-operator

The output will look something like this, starting with the curl transfer progress meter:

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  2579    0  2579    0     0  59022      0 --:--:-- --:--:-- --:--:-- 59976

This is followed by a JSON string that, when prettified, looks like this:

  {
      "Injected.Java.PodsCount": 16,
      "Injected.NodeJs.PodsCount": 2,
      "Injected.PodsCount": 18,
      "Performance.AllocationRate": 1516244112,
      "Performance.CPUUsage": 3,
      "Performance.ExceptionCount": 2,
      "Performance.GCCommittedBytes": 325,
      "Performance.GCFragmentation": 57.8423222582457,
      "Performance.GCHeapSize": 150,
      "Performance.Gen0GCCount": 19,
      "Performance.Gen0Size": 24,
      "Performance.Gen1GCCount": 3,
      "Performance.Gen1Size": 732488,
      "Performance.Gen2GCCount": 1,
      "Performance.Gen2Size": 177961408,
      "Performance.ILBytesJitted": 1402364,
      "Performance.LOHSize": 60709424,
      "Performance.MonitorLockContentionCount": 12,
      "Performance.NumberofActiveTimers": 19,
      "Performance.NumberofAssembliesLoaded": 187,
      "Performance.NumberofMethodsJitted": 21170,
      "Performance.PercentTimeinGCsincelastGC": 0,
      "Performance.POHPinnedObjectHeapSize": 289688,
      "Performance.ThreadPoolCompletedWorkItemCount": 1566,
      "Performance.ThreadPoolQueueLength": 0,
      "Performance.ThreadPoolThreadCount": 6,
      "Performance.TimespentinJIT": 0,
      "Performance.WorkingSet": 523,
      "Resources.AgentConfigurationResource.NamespacesCount": 37,
      "Resources.AgentConfigurationResource.ResourcesCount": 74,
      "Resources.AgentConnectionResource.NamespacesCount": 37,
      "Resources.AgentConnectionResource.ResourcesCount": 37,
      "Resources.AgentInjectorResource.NamespacesCount": 37,
      "Resources.AgentInjectorResource.ResourcesCount": 148,
      "Resources.ClusterAgentConnectionResource.NamespacesCount": 1,
      "Resources.ClusterAgentConnectionResource.ResourcesCount": 1,
      "Resources.DaemonSetResource.NamespacesCount": 16,
      "Resources.DaemonSetResource.ResourcesCount": 24,
      "Resources.DeploymentConfigResource.NamespacesCount": 439,
      "Resources.DeploymentConfigResource.ResourcesCount": 4564,
      "Resources.DeploymentResource.NamespacesCount": 104,
      "Resources.DeploymentResource.ResourcesCount": 251,
      "Resources.Global.NamespacesCount": 543,
      "Resources.Global.ResourcesCount": 27020,
      "Resources.PodResource.NamespacesCount": 475,
      "Resources.PodResource.ResourcesCount": 2852,
      "Resources.SecretResource.NamespacesCount": 543,
      "Resources.SecretResource.ResourcesCount": 19086,
      "Resources.StatefulSetResource.NamespacesCount": 9,
      "Resources.StatefulSetResource.ResourcesCount": 30,
      "UptimeSeconds": 60427.7069415,
      "Process.WorkingSet64": 524357632,
      "Process.MinWorkingSet": 0,
      "Process.MaxWorkingSet": 2147483648,
      "Process.PeakWorkingSet64": 909082624,
      "Process.PrivateMemorySize64": 611205120,
      "Process.VirtualMemorySize64": 10074427392,
      "Process.PeakVirtualMemorySize64": 10326552576,
      "Process.PagedMemorySize64": 0,
      "Process.PeakPagedMemorySize64": 0,
      "Process.NonpagedSystemMemorySize64": 0,
      "Process.TotalProcessorTime": "00:56:53.1000000",
      "Process.UserProcessorTime": "00:48:32.6300000",
      "Process.PrivilegedProcessorTime": "00:08:20.4700000",
      "Process.Thread": 16,
      "Process.Modules": 155,
      "IsLeader": "True"
  }
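
When only one or two counters are of interest, such as Injected.PodsCount or IsLeader, they can be pulled out of the JSON without extra tooling. The following is a minimal sketch that works on a flat JSON object like the sample above, whether compact or prettified; metric_value is a hypothetical helper name, and the approach is not a general JSON parser.

```shell
# Hypothetical helper: print the value of one metric from the JSON on stdin.
# Handles only a flat object of simple values, as in the sample above.
metric_value() {
  key="\"$1\":"
  # Put each "key": value pair on its own line, then pick the matching one.
  tr ',{}' '\n\n\n' | awk -v key="$key" '
    !found && index($0, key) {
      val = substr($0, index($0, key) + length(key))
      gsub(/[ \t\r"]/, "", val)   # drop whitespace and surrounding quotes
      print val
      found = 1
    }'
}

# Example usage, with the metrics JSON saved to a file (file name is illustrative):
#   metric_value Injected.PodsCount < metrics.json
#   metric_value IsLeader < metrics.json
```

Because the key is matched together with its opening quote, asking for Injected.PodsCount does not accidentally return Injected.Java.PodsCount.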

Also useful is the output of:

kubectl describe deployment/contrast-agent-operator --namespace contrast-agent-operator

This produces output similar to the following:

Name:                   contrast-agent-operator
Namespace:              contrast-agent-operator
CreationTimestamp:      Mon, 28 Oct 2024 15:55:54 -0400
Labels:                 app.kubernetes.io/managed-by=Helm
                        app.kubernetes.io/name=operator
                        app.kubernetes.io/part-of=contrast-agent-operator
Annotations:            deployment.kubernetes.io/revision: 4
                        meta.helm.sh/release-name: contrast-agent-operator
                        meta.helm.sh/release-namespace: contrast-agent-operator
Selector:               app.kubernetes.io/name=operator,app.kubernetes.io/part-of=contrast-agent-operator
Replicas:               1 desired | 1 updated | 1 total | 1 available | 0 unavailable
StrategyType:           RollingUpdate
MinReadySeconds:        0
RollingUpdateStrategy:  25% max unavailable, 25% max surge
Pod Template:
  Labels:           app.kubernetes.io/name=operator
                    app.kubernetes.io/part-of=contrast-agent-operator
  Service Account:  contrast-agent-operator-service-account
  Containers:
   contrast-agent-operator:
    Image:      contrast/agent-operator:1.5.4
    Port:       5001/TCP
    Host Port:  0/TCP
    Limits:
      cpu:     2
      memory:  512Mi
    Requests:
      cpu:      500m
      memory:   256Mi
    Liveness:   http-get https://:5001/health delay=0s timeout=1s period=10s #success=1 #failure=3
    Readiness:  http-get https://:5001/ready delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:
      CONTRAST_DEFAULT_REGISTRY:              contrast
      CONTRAST_SETTLE_DURATION:               10
      CONTRAST_EVENT_QUEUE_SIZE:              10000
      CONTRAST_EVENT_QUEUE_FULL_MODE:         DropOldest
      CONTRAST_WEBHOOK_SECRET:                contrast-web-hook-secret
      CONTRAST_WEBHOOK_CONFIGURATION:         contrast-web-hook-configuration
      CONTRAST_ENABLE_EARLY_CHAINING:         false
      CONTRAST_INSTALL_SOURCE:                helm
      CONTRAST_INITCONTAINER_CPU_REQUEST:     100m
      CONTRAST_INITCONTAINER_CPU_LIMIT:       100m
      CONTRAST_INITCONTAINER_MEMORY_REQUEST:  64Mi
      CONTRAST_INITCONTAINER_MEMORY_LIMIT:    64Mi
      POD_NAMESPACE:                           (v1:metadata.namespace)
      CONTRAST_WEBHOOK_SERVICENAME:           contrast-agent-operator
      CONTRAST_WEBHOOK_HOSTS:                 $(CONTRAST_WEBHOOK_SERVICENAME),$(CONTRAST_WEBHOOK_SERVICENAME).$(POD_NAMESPACE).svc,$(CONTRAST_WEBHOOK_SERVICENAME).$(POD_NAMESPACE).svc.cluster.local
    Mounts:  <none>
  Volumes:   <none>
Conditions:
  Type           Status  Reason
  ----           ------  ------
  Progressing    True    NewReplicaSetAvailable
  Available      True    MinimumReplicasAvailable
OldReplicaSets:  contrast-agent-operator-85ffdb79b8 (0/0 replicas created)
NewReplicaSet:   contrast-agent-operator-58f844bf6b (1/1 replicas created)
Events:          <none>

These outputs can provide invaluable detail when troubleshooting resource issues.
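
As a worked example of using the two outputs together, the sketch below compares a working-set figure from the metrics with the memory limit from the deployment description. It is purely illustrative: the two samples in this article were not necessarily captured from the same cluster, mem_pct_of_limit is a hypothetical helper name, and treating Process.WorkingSet64 as the figure that counts against the container limit is an assumption.

```shell
# Hypothetical helper: percentage of a memory limit (given in Mi) that a
# byte count represents, rounded down to a whole number.
mem_pct_of_limit() {
  bytes="$1"
  limit_mi="$2"
  echo $(( bytes * 100 / (limit_mi * 1024 * 1024) ))
}

# Process.WorkingSet64 from the metrics sample vs. the 512Mi limit
# shown in the deployment description.
mem_pct_of_limit 524357632 512   # prints 97
```

A value this close to 100 would suggest that the operator is running near its memory limit and that the limit may need to be reviewed.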
