SSABasedGenericKubernetesResourceMatcher Incorrectly Detects Mismatch for initialDelaySeconds=0 in Deployment ReadinessProbe #2742

Closed
kamilchociej opened this issue Mar 27, 2025 · 12 comments

@kamilchociej

Bug Report

What did you do?

  1. Define a Deployment CRUDKubernetesDependentResource in a Java Operator SDK controller with desired resource containing spec.template.spec.containers[0].readinessProbe.initialDelaySeconds=0.
  2. Trigger reconciliation.
  3. Observe that the controller repeatedly considers the resource as out-of-sync, even though there is no actual change needed.

The SSABasedGenericKubernetesResourceMatcher in Java Operator SDK incorrectly considers a Deployment resource as not matching the actual Kubernetes resource when spec.template.spec.containers[*].readinessProbe.initialDelaySeconds is set to 0.

Kubernetes omits fields with default values (0 for initialDelaySeconds) when storing resources, meaning the actual resource map retrieved from Kubernetes does not contain this field. However, the desired resource map includes initialDelaySeconds: 0, leading to an unnecessary update being triggered.

What did you expect to see?

If initialDelaySeconds: 0 is absent in the actual resource due to Kubernetes' defaulting behavior, the matcher should recognize this as equivalent to explicitly setting it to 0 in the desired resource.

What did you see instead? Under which circumstances?

The matcher detects a difference because the desired resource contains initialDelaySeconds: 0, while the actual resource lacks this field. This results in an unnecessary update being applied.

Environment

Kubernetes cluster type:

vanilla

$ Mention java-operator-sdk version from pom.xml file

4.9.7

$ java -version

21

$ kubectl version

v1.30.4

Possible Solution

Modify SSABasedGenericKubernetesResourceMatcher to handle cases where Kubernetes omits fields with default values, treating their absence as equivalent to explicitly setting the default value.

Additional context

This issue may affect other defaulted fields in Kubernetes objects, not just initialDelaySeconds. A general approach to handling default values might be necessary.
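One way to picture such a general approach is to normalize both sides before comparing them. The sketch below is only an illustration on plain nested maps: the class `ZeroValuePruning` and its methods are hypothetical and are not part of Java Operator SDK, and it assumes that every zero-valued number can be treated as "omitted", which would be wrong for fields where 0 is a meaningful, stored value.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ZeroValuePruning {

  // Returns a copy of the map without zero-valued numbers, recursing into nested maps.
  // Lists are left untouched to keep the sketch short.
  @SuppressWarnings("unchecked")
  static Map<String, Object> pruneZeroNumbers(Map<String, Object> source) {
    Map<String, Object> result = new LinkedHashMap<>();
    for (Map.Entry<String, Object> entry : source.entrySet()) {
      Object value = entry.getValue();
      if (value instanceof Number && ((Number) value).doubleValue() == 0.0) {
        continue; // assumption: a zero number is equivalent to an absent field
      }
      if (value instanceof Map) {
        value = pruneZeroNumbers((Map<String, Object>) value);
      }
      result.put(entry.getKey(), value);
    }
    return result;
  }

  // Compares desired and actual after applying the same normalization to both.
  static boolean matchesIgnoringZeroDefaults(
      Map<String, Object> desired, Map<String, Object> actual) {
    return pruneZeroNumbers(desired).equals(pruneZeroNumbers(actual));
  }

  public static void main(String[] args) {
    Map<String, Object> desired =
        Map.of("readinessProbe", Map.of("initialDelaySeconds", 0, "periodSeconds", 10));
    Map<String, Object> actual =
        Map.of("readinessProbe", Map.of("periodSeconds", 10));

    System.out.println("plain equals: " + desired.equals(actual));
    System.out.println("normalized:   " + matchesIgnoringZeroDefaults(desired, actual));
  }
}
```

A real implementation would need per-field knowledge of which defaults the API server drops, so a blanket rule like this one is at best a starting point.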

@csviri
Collaborator

csviri commented Mar 27, 2025

Thx @kamilchociej, I think the current workaround is to fill in the defaults in your current resources.

@csviri
Collaborator

csviri commented Mar 27, 2025

Just to make sure I understand: this does not result in an infinite loop on your side, just an additional reconciliation, right?

@kamilchociej
Author

It is an infinite reconciliation loop. The problem is that the desired resource contains the default value (0), while the resource stored in k8s does not. I have no control over the actual resource in io.javaoperatorsdk.operator.processing.dependent.kubernetes.SSABasedGenericKubernetesResourceMatcher.

@kamilchociej
Author

The default value is removed from the resource definition on the k8s side, while the property remains as a managed field in the controller's managed fields entry.

@kamilchociej
Author

The whole process looks like this:

  1. Desired Deployment contains spec.template.spec.containers[0].readinessProbe.initialDelaySeconds=0.
  2. Controller applies desired state.
  3. K8s applies the change but returns Deployment without spec.template.spec.containers[0].readinessProbe.initialDelaySeconds
  4. Controller is notified about the Update
  5. Desired Deployment contains spec.template.spec.containers[0].readinessProbe.initialDelaySeconds=0
  6. Actual Deployment does not contain spec.template.spec.containers[0].readinessProbe.initialDelaySeconds
  7. SSABasedGenericKubernetesResourceMatcher returns false
  8. Deployment with spec.template.spec.containers[0].readinessProbe.initialDelaySeconds=0 is applied
  9. All previous steps, starting from the 4th, happen again.
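
The loop in the steps above can be reproduced in miniature with plain maps. This is a simplified sketch, not the SDK's matcher: `storeOnServer` is a hypothetical stand-in for the API server dropping zero-valued fields as described in this issue, and `matches` is a naive equality check standing in for SSABasedGenericKubernetesResourceMatcher.

```java
import java.util.HashMap;
import java.util.Map;

public class ProbeMismatchDemo {

  // Hypothetical stand-in for the API server: stores the applied probe
  // but drops integer fields whose value is 0.
  static Map<String, Object> storeOnServer(Map<String, Object> applied) {
    Map<String, Object> stored = new HashMap<>(applied);
    stored.values().removeIf(value -> Integer.valueOf(0).equals(value));
    return stored;
  }

  // Naive matcher: desired and actual match only if the maps are equal.
  static boolean matches(Map<String, Object> desired, Map<String, Object> actual) {
    return desired.equals(actual);
  }

  public static void main(String[] args) {
    Map<String, Object> desired = new HashMap<>();
    desired.put("initialDelaySeconds", 0);
    desired.put("periodSeconds", 10);

    // Steps 2-3: the controller applies the desired state, the server stores it.
    Map<String, Object> actual = storeOnServer(desired);

    // Steps 4-8, repeated: every reconciliation sees a mismatch and re-applies.
    int applies = 0;
    for (int round = 0; round < 3; round++) {
      if (!matches(desired, actual)) {
        actual = storeOnServer(desired);
        applies++;
      }
    }
    System.out.println("actual = " + actual + ", applies = " + applies);
  }
}
```

However many rounds are run, the stored map never gains initialDelaySeconds, so the comparison never succeeds and every round ends in another apply.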

@kamilchociej
Author

It looks like 0 for an Integer is treated as null by k8s, but the difference between setting 0 and null is that if you set 0, the field is marked as managed by the operator controller.

@csviri
Collaborator

csviri commented Mar 27, 2025

Can you try setting previousAnnotationForDependentResourcesEventFiltering to false and see if it helps with the infinite loop?

@kamilchociej
Author

Where can I set this?

@kamilchociej
Author

kamilchociej commented Mar 27, 2025

The following logic, applied to each desired Probe, resolves the issue:

  // Requires java.util.Objects and io.fabric8.kubernetes.api.model.Probe.
  // Rewrites zero-valued probe fields so the desired state matches what k8s returns.
  private void sanitizeProbe(Probe probe) {
    if (probe != null) {
      // 0 is not stored by k8s for these fields, so leave them unset instead.
      if (Objects.equals(0, probe.getInitialDelaySeconds())) {
        probe.setInitialDelaySeconds(null);
      }
      if (Objects.equals(0L, probe.getTerminationGracePeriodSeconds())) {
        probe.setTerminationGracePeriodSeconds(null);
      }
      // For these fields, replace 0 with the value k8s applies by default.
      if (Objects.equals(0, probe.getPeriodSeconds())) {
        probe.setPeriodSeconds(10);
      }
      if (Objects.equals(0, probe.getTimeoutSeconds())) {
        probe.setTimeoutSeconds(1);
      }
      if (Objects.equals(0, probe.getSuccessThreshold())) {
        probe.setSuccessThreshold(1);
      }
      if (Objects.equals(0, probe.getFailureThreshold())) {
        probe.setFailureThreshold(3);
      }
    }
  }

This is an ugly workaround, but at least it works for Probes.

@csviri
Collaborator

csviri commented Mar 27, 2025

Can you try setting previousAnnotationForDependentResourcesEventFiltering to false and see if it helps with the infinite loop?

default boolean previousAnnotationForDependentResourcesEventFiltering() {

It is defined in ConfigurationService and can be overridden when creating the Operator, similar to this:

Operator operator = new Operator(o -> o.withStopOnInformerErrorDuringStartup(false));

@csviri
Collaborator

csviri commented Mar 27, 2025

That should fix the infinite loop.

@csviri
Collaborator

csviri commented Mar 31, 2025

Closing this as a duplicate of #2553.
