Contrast-agent-operator stuck in InjectionPending

Issue

After the contrast-agent-operator has been installed and configured, application pods may not be injected with the agent.

Checking the agent operator logs may show Pod '{namespace}/{podname}' status is still in InjectionPending.

When executing kubectl logs -f deployment/contrast-agent-operator --namespace contrast-agent-operator you should see output similar to the following:

INFO level logging
[2023-06-09 17:34:54.9966 INFO PodTemplateStatusHandler] Pod 'default/petclinic-68b98dd457-nwsbr' status was updated 'None' -> 'InjectionPending'.
[2023-06-09 17:34:55.0594 INFO PodTemplateInjectionHandler] Workload 'DeploymentResource/default/petclinic' will be patched (Injector: 'AgentInjectorResource/default/injector-for-petclinic').
[2023-06-09 17:34:55.1135 INFO MatchInjectorsHandler] Completed re-calculating injection points after 130ms.
[2023-06-09 17:34:55.1135 INFO MergingStateProvider] Merging state modified events until '06/09/2023 17:35:05 +00:00'.
[2023-06-09 17:34:55.1135 INFO MatchInjectorsHandler] Cluster state changed, re-calculating injection points (1 changes merged).
[2023-06-09 17:34:55.1135 INFO MatchInjectorsHandler] Completed re-calculating injection points after 0ms.
[2023-06-09 17:35:05.9885 INFO MatchInjectorsHandler] Cluster state changed, re-calculating injection points (3 changes merged).
[2023-06-09 17:35:05.9885 INFO PodTemplateStatusHandler] Pod 'default/petclinic-7b4d469886-497zt' status was updated 'None' -> 'InjectionPending'.
[2023-06-09 17:35:06.0272 INFO MatchInjectorsHandler] Completed re-calculating injection points after 38ms.
[2023-06-09 17:35:06.0388 INFO MergingStateProvider] Merging state modified events until '06/09/2023 17:35:16 +00:00'.
[2023-06-09 17:35:06.0388 INFO MatchInjectorsHandler] Cluster state changed, re-calculating injection points (1 changes merged).
[2023-06-09 17:35:06.0437 INFO PodTemplateStatusHandler] Pod 'default/petclinic-7b4d469886-497zt' status is still in 'InjectionPending'.

TRACE level events if enabled
Preparing to patch status 'default/demo-app-856c46887c-w4vwh' ('Pod/v1') with '{"lastTransitionTime":"2023-03-01T16:36:17.566548\u002B00:00","message":"The pod is eligible for agent injection, but is currently not injected.","reason":"InjectionPending","status":"False","type":"agents.contrastsecurity.com/injection-converged"}'.
[2023-03-01 16:36:17.5883 TRACE ResourcePatcher] Patch complete after 20ms.
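
When the operator log is busy, it can help to reduce it to the pods that are actually stuck. The following is a minimal sketch, assuming the INFO message format shown above; pending_pods is a hypothetical helper name, not part of the operator or kubectl.

```shell
# List pods that the operator reports as still waiting for injection.
# Reads operator log lines on stdin and prints each 'namespace/pod' once.
# Assumes the "status is still in 'InjectionPending'" message format shown above.
pending_pods() {
  grep -F "status is still in 'InjectionPending'" \
    | sed -E "s/.*Pod '([^']+)' status is still in.*/\1/" \
    | sort -u
}
```

It could be used by piping the operator logs into it, for example: kubectl logs deployment/contrast-agent-operator --namespace contrast-agent-operator | pending_pods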

 

Cause

If you see output similar to the above, there may be networking issues between the Kubernetes API Server and the application nodes. In that case, the API server logs show the call to the mutating webhook timing out, for example:

I0410 15:44:52.999878      11 trace.go:205] Trace[976444596]: "Call mutating webhook" configuration:contrast-web-hook-configuration,webhook:pods.agents.contrastsecurity.com,resource:/v1, Resource=pods,subresource:,operation:CREATE,UID:6df5da8e-472b-480f-a45a-8dcd48a43fc2 (10-Apr-2023 15:44:50.999) (total time: 2000ms):
Trace[976444596]: [2.000672907s] [2.000672907s] END
W0410 15:44:52.999918      11 dispatcher.go:180] Failed calling webhook, failing open pods.agents.contrastsecurity.com: failed calling webhook "pods.agents.contrastsecurity.com": failed to call webhook: Post "https://contrast-agent-operator.contrast-agent-operator.svc:443/v1/pods/podmutationwebhook/mutate?timeout=2s": context deadline exceeded
E0410 15:44:52.999933      11 dispatcher.go:184] failed calling webhook "pods.agents.contrastsecurity.com": failed to call webhook: Post "https://contrast-agent-operator.contrast-agent-operator.svc:443/v1/pods/podmutationwebhook/mutate?timeout=2s": context deadline exceeded

Our Operator requires bi-directional network connectivity on port 443 between the API Server and Application nodes. This is expected for admission webhooks.

In EKS: Check the apiserver logs for failures calling the webhook like those shown above.

Another potential issue with network access to the host machine could result in an error message like the following:

Failed calling webhook, failing open pods.agents.contrastsecurity.com: failed calling webhook "pods.agents.contrastsecurity.com": failed to call webhook: Post "https://contrast-agent-operator.contrast-agent-operator.svc:443/v1/pods/podmutationwebhook/mutate?timeout=2s": Address is not allowed
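
The two failure modes above lead to different fixes, so it can be useful to sort API server log lines by which one they show. This is a rough sketch that matches only the two error strings quoted in this article; classify_webhook_failure and its output labels are hypothetical names chosen for illustration.

```shell
# Classify API server log lines (stdin) by the webhook failure they report.
# Prints "timeout" for "context deadline exceeded" (see the firewall and
# security group workarounds) and "address-not-allowed" for
# "Address is not allowed" (see the hostNetwork workaround).
# Each label is printed at most once; unrelated lines are ignored.
classify_webhook_failure() {
  while IFS= read -r line; do
    case "$line" in
      *"pods.agents.contrastsecurity.com"*"context deadline exceeded"*)
        echo "timeout" ;;
      *"pods.agents.contrastsecurity.com"*"Address is not allowed"*)
        echo "address-not-allowed" ;;
    esac
  done | sort -u
}
```

Feed it the API server log text from wherever your cluster exposes it; how those logs are retrieved depends on your environment and is not covered by this sketch.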

Resolution

The errors indicate a failure to communicate on port 443, even though in this instance the issue is with port 5001. Due to the cluster's configuration, the API Server is attempting to communicate directly with the operator's pod on port 5001 instead of its service on 443. Possible causes:

  • The cluster may be configured in such a way that the API server communicates directly with the pods instead of services.  
  • There is a firewall configured between the control plane and application nodes.
    • It may be blocking port 5001 (in the case where the API Server communicates directly with the pods)
    • It may be blocking port 443

To work around firewalls:

If routing between the control plane and application nodes cannot be improved, port 5001 may need to be allowed from the API server to the application nodes.

To work around this in EKS security groups:

Locate the security group associated with communication from the Cluster API to the application node groups.   

Create a new custom TCP entry for port 5001 to allow communication between the two.

Once this is saved, try deleting one of the application's pods and check that injection is now working. 

You should see a contrast-init container for the pod, and the pod logs will show the Java agent starting up:

Picked up JAVA_TOOL_OPTIONS: -javaagent:/opt/contrast/contrast-agent.jar
[Contrast] Fri Jun 09 19:28:11 GMT 2023 Loading pre-packaged configuration
[Contrast] Fri Jun 09 19:28:11 GMT 2023 Couldn't find pre-packaged configuration.
[Contrast] Fri Jun 09 19:28:11 GMT 2023 Starting Contrast (build 5.1.0) Pat. 8,458,789 B2
[Contrast] Fri Jun 09 19:28:12 GMT 2023 Contrast logger configuration errors will be logged to stderr
[Contrast] Fri Jun 09 19:28:13 GMT 2023 Copyright: 2023 Contrast Security, Inc
[Contrast] Fri Jun 09 19:28:13 GMT 2023 Contact: support@contrastsecurity.com
[Contrast] Fri Jun 09 19:28:13 GMT 2023 License: Commercial
[Contrast] Fri Jun 09 19:28:13 GMT 2023 NOTICE: This Software and the patented inventions embodied within may only be used as part of
[Contrast] Fri Jun 09 19:28:13 GMT 2023 Contrast Security's commercial offerings. Even though it is made available through public
[Contrast] Fri Jun 09 19:28:13 GMT 2023 repositories, use of this Software is subject to the applicable End User Licensing Agreement
[Contrast] Fri Jun 09 19:28:13 GMT 2023 found at https://www.contrastsecurity.com/enduser-terms-0317a or as otherwise agreed between
[Contrast] Fri Jun 09 19:28:13 GMT 2023 Contrast Security and the End User. The Software may not be reverse engineered, modified,
[Contrast] Fri Jun 09 19:28:13 GMT 2023 repackaged, sold, redistributed or otherwise used in a way not consistent with the End User
[Contrast] Fri Jun 09 19:28:13 GMT 2023 License Agreement.
[Contrast] Fri Jun 09 19:28:13 GMT 2023 The Contrast Java agent collects usage data in order to help us improve compatibility and security coverage.
[Contrast] Fri Jun 09 19:28:13 GMT 2023 The data is anonymous and does not contain application data. It is collected by Contrast and is never shared.
[Contrast] Fri Jun 09 19:28:13 GMT 2023 You can opt-out of telemetry by setting the CONTRAST_AGENT_TELEMETRY_OPTOUT environment variable to 'true' or '1'
[Contrast] Fri Jun 09 19:28:13 GMT 2023 Read more about Contrast Java agent telemetry: https://docs.contrastsecurity.com/en/java-telemetry.html
[Contrast] Fri Jun 09 19:28:14 GMT 2023 Effective instructions: Assess=false, Protect=true
[Contrast] Fri Jun 09 19:28:14 GMT 2023 Contrast logger configuration errors will be logged to stderr
[Contrast] Fri Jun 09 19:28:19 GMT 2023 Starting JVM [8078ms]
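
The check for the contrast-init container can also be scripted. The sketch below assumes the init container is named exactly contrast-init, as described above; has_contrast_init is a hypothetical helper, and <pod-name> in the comment is a placeholder for one of your application pods.

```shell
# Succeeds if the space-separated init container names in $1
# include "contrast-init" (the name assumed from this article).
has_contrast_init() {
  case " $1 " in
    *" contrast-init "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example (requires cluster access; <pod-name> is a placeholder):
#   names=$(kubectl get pod <pod-name> -o jsonpath='{.spec.initContainers[*].name}')
#   has_contrast_init "$names" && echo "pod was injected"
```

The helper only inspects the pod spec. The agent startup lines in the pod logs remain the stronger confirmation that injection worked end to end.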

To work around via Terraform: When deploying clusters via Terraform, the following security group rule can be added to allow this traffic:

  node_security_group_additional_rules = {
    ingress_allow_access_from_control_plane_contrast = {
      description                   = "Allow contrast_agent_operator readycheck communication from Cluster API to node pods on tcp/5001."
      protocol                      = "tcp"
      from_port                     = 5001
      to_port                       = 5001
      type                          = "ingress"
      source_cluster_security_group = true
    }
  }

To resolve the "Address is not allowed" issue: Try adding hostNetwork: true to the spec.template.spec section of the operator deployment. Open the deployment for editing by running the following command:

kubectl -n contrast-agent-operator edit deployment/contrast-agent-operator

You may need to delete the existing operator pod after making the change so that the new one can start. To do so, run kubectl -n contrast-agent-operator get pods, note the name of the operator pod, and then run kubectl -n contrast-agent-operator delete pod/<operator name from the prior step>.
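
As a non-interactive alternative to kubectl edit, the same hostNetwork change might be applied with kubectl patch. This is a sketch under the assumption that a JSON merge patch on spec.template.spec is acceptable for your deployment; HOST_NETWORK_PATCH and enable_operator_host_network are hypothetical names, and the function is only defined here, not run.

```shell
# JSON merge patch that sets hostNetwork: true on the operator's pod template.
HOST_NETWORK_PATCH='{"spec":{"template":{"spec":{"hostNetwork":true}}}}'

# Apply the patch to the operator deployment (requires cluster access).
# Namespace and deployment name are the ones used throughout this article.
enable_operator_host_network() {
  kubectl -n contrast-agent-operator patch deployment contrast-agent-operator \
    --type merge -p "$HOST_NETWORK_PATCH"
}
```

Calling enable_operator_host_network changes the pod template, which should trigger a rollout of the operator. If the new pod does not start, delete the old one as described above.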
