Skip to content

failed invoking InformerEventSource event handler: null: java.lang.NullPointerException #2694

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
matteriben opened this issue Feb 21, 2025 · 4 comments · Fixed by #2696
Closed
Assignees

Comments

@matteriben
Copy link
Contributor

I'm updating to version 5. I ran an existing test that creates many custom resources which need to be reconciled and noticed that I would occasionally see these errors in the logs.

2025-02-21 12:11:43,724 ERROR [io.fab.kub.cli.inf.imp.cac.SharedProcessor] (-1324526451-pool-1-thread-3) v1/configmaps failed invoking InformerEventSource{resourceClass: ConfigMap} event handler: null: java.lang.NullPointerException
	at java.base/java.util.Objects.requireNonNull(Objects.java:233)
	at io.quarkus.vertx.mdc.provider.LateBoundMDCProvider.putLocal(LateBoundMDCProvider.java:147)
	at io.quarkus.vertx.mdc.provider.LateBoundMDCProvider.put(LateBoundMDCProvider.java:63)
	at org.jboss.logmanager.MDC.put(MDC.java:90)
	at org.slf4j.impl.Slf4jMDCAdapter.put(Slf4jMDCAdapter.java:32)
	at org.slf4j.MDC.put(MDC.java:123)
	at io.javaoperatorsdk.operator.processing.MDCUtils.addResourceInfo(MDCUtils.java:36)
	at java.base/java.util.Optional.ifPresent(Optional.java:178)
	at io.javaoperatorsdk.operator.processing.event.EventProcessor.submitReconciliationExecution(EventProcessor.java:128)
	at io.javaoperatorsdk.operator.processing.event.EventProcessor.handleMarkedEventForResource(EventProcessor.java:119)
	at io.javaoperatorsdk.operator.processing.event.EventProcessor.handleEvent(EventProcessor.java:109)
	at io.javaoperatorsdk.operator.processing.event.source.informer.InformerEventSource.lambda$propagateEvent$4(InformerEventSource.java:230)
	at java.base/java.lang.Iterable.forEach(Iterable.java:75)
	at io.javaoperatorsdk.operator.processing.event.source.informer.InformerEventSource.propagateEvent(InformerEventSource.java:220)
	at io.javaoperatorsdk.operator.processing.event.source.informer.InformerEventSource.onAddOrUpdate(InformerEventSource.java:175)
	at io.javaoperatorsdk.operator.processing.event.source.informer.InformerEventSource.onUpdate(InformerEventSource.java:140)
	at io.javaoperatorsdk.operator.processing.event.source.informer.InformerEventSource.onUpdate(InformerEventSource.java:68)
	at io.fabric8.kubernetes.client.informers.impl.cache.ProcessorListener$UpdateNotification.handle(ProcessorListener.java:96)
	at io.fabric8.kubernetes.client.informers.impl.cache.ProcessorListener.add(ProcessorListener.java:50)
	at io.fabric8.kubernetes.client.informers.impl.cache.SharedProcessor.lambda$distribute$0(SharedProcessor.java:92)
	at io.fabric8.kubernetes.client.informers.impl.cache.SharedProcessor.lambda$distribute$1(SharedProcessor.java:115)
	at io.fabric8.kubernetes.client.utils.internal.SerialExecutor.lambda$execute$0(SerialExecutor.java:57)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
	at java.base/java.lang.Thread.run(Thread.java:1583)

I verified this behavior with both 5.0.1 and 5.0.2. I believe is this caused by a data race between the two points below.

Write originalResource.getMetadata().setResourceVersion(null):

originalResource.getMetadata().setResourceVersion(null);
resource.getMetadata().setResourceVersion(null);
try {
if (useSSA) {
var managedFields = resource.getMetadata().getManagedFields();
try {
resource.getMetadata().setManagedFields(null);
var res = resource(resource);
return res.subresource("status").patch(new PatchContext.Builder()
.withFieldManager(fieldManager)
.withForce(true)
.withPatchType(PatchType.SERVER_SIDE_APPLY)
.build());
} finally {
resource.getMetadata().setManagedFields(managedFields);
}
} else {
var res = resource(originalResource);
return res.editStatus(r -> resource);
}
} finally {
// restore initial resource version
originalResource.getMetadata().setResourceVersion(resourceVersion);

Read resource.getMetadata().getResourceVersion():

MDC.put(RESOURCE_VERSION, resource.getMetadata().getResourceVersion());

@matteriben
Copy link
Contributor Author

matteriben commented Feb 22, 2025

I noticed another occasional failure which seems to be related. This one seems to be causing my tests to actually fail. I tested with the change in PR #2695 to prevent the NPE. With that change both the NPE & ISE went away and my tests passed consistently.

Exception in thread "pool-48-thread-30" java.lang.IllegalStateException: Timer already cancelled.
	at java.base/java.util.Timer.sched(Timer.java:409)
	at java.base/java.util.Timer.schedule(Timer.java:205)
	at io.javaoperatorsdk.operator.processing.event.source.timer.TimerEventSource.scheduleOnce(TimerEventSource.java:49)
	at io.javaoperatorsdk.operator.processing.event.EventProcessor.lambda$scheduleExecutionForMaxReconciliationInterval$3(EventProcessor.java:287)
	at java.base/java.util.Optional.ifPresent(Optional.java:178)
	at io.javaoperatorsdk.operator.processing.event.EventProcessor.scheduleExecutionForMaxReconciliationInterval(EventProcessor.java:281)
	at io.javaoperatorsdk.operator.processing.event.EventProcessor.lambda$reScheduleExecutionIfInstructed$2(EventProcessor.java:275)
	at java.base/java.util.Optional.ifPresentOrElse(Optional.java:198)
	at io.javaoperatorsdk.operator.processing.event.EventProcessor.reScheduleExecutionIfInstructed(EventProcessor.java:271)
	at io.javaoperatorsdk.operator.processing.event.EventProcessor.eventProcessingFinished(EventProcessor.java:250)
	at io.javaoperatorsdk.operator.processing.event.EventProcessor$ReconcilerExecutor.run(EventProcessor.java:445)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
	at java.base/java.lang.Thread.run(Thread.java:1583)

@csviri
Copy link
Collaborator

csviri commented Feb 22, 2025

Hi @matteriben , thx for reporting, will take a look soon.

@csviri csviri self-assigned this Feb 22, 2025
This was linked to pull requests Feb 22, 2025
@csviri
Copy link
Collaborator

csviri commented Feb 22, 2025

Hi @matteriben , I also added a PR, actually there is no need to remove resource version for SSA, basically that is what we don't want to do if user explicitly sets it. Will dig a bit more if we can also handle more elegantly the edit. Cloning the resource from the cache would solve this, but that is a costly operation.

@csviri
Copy link
Collaborator

csviri commented Feb 22, 2025

@matteriben pls check my PR, in case feel free to test it from that branch, that should fix the issue in general. In the and for non-ssa patch I ended up to cloning the resource, why this might have a slight performance overhead, in no circumstances should be change the resource in the cache, since that can lead to some issues like this reported.

This should cover the issue.

cc @metacosm @xstefank

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants