CustomResource Controllers stop receiving updates after watch reconnect #395

Closed · secondsun opened this issue Apr 16, 2021 · 2 comments

@secondsun (Contributor) commented Apr 16, 2021

Sometimes our controllers stop receiving updates about their custom resources. We've observed that a watch can become disconnected, and that the custom resource event source then tries to reconnect but fails. We believe this reconnect failure is the cause of the problem.

I've traced the reconnection and exceptions to here: https://github.com/java-operator-sdk/java-operator-sdk/blob/master/operator-framework-core/src/main/java/io/javaoperatorsdk/operator/processing/event/internal/CustomResourceEventSource.java#L157. I believe the registerWatch method is throwing an exception which isn't caught by the SDK. Because the exception is not caught, the watch stays dead and the event source no longer sends events. See my logs here, where I've isolated the failure I'm describing: https://gist.github.com/secondsun/8e31d2680ff689750c62ad6ce9f419c0
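
To illustrate, here's a minimal sketch of the pattern as I read it. This is not the actual SDK source; the isHttpGone() branch and the registerWatch() call are my assumptions based on reading the class:

```java
import io.fabric8.kubernetes.client.WatcherException;

// Simplified sketch of the failure mode, NOT the actual SDK source.
class WatchFailureSketch {

  public void onClose(WatcherException e) {
    if (e != null && e.isHttpGone()) {
      // If registerWatch itself throws here (e.g. the API server is briefly
      // unreachable), nothing above this callback catches the exception:
      // the watch stays dead and the event source silently stops
      // delivering events.
      registerWatch();
    }
  }

  private void registerWatch() {
    // Stand-in for the SDK's watch re-registration logic.
    throw new RuntimeException("connection refused");
  }
}
```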

To work around this for now, I am using reflection and CDI schedulers to get a reference to the CustomResourceEventSource and call onClose with a subclassed WatcherException (secondsun/app-services-operator@a25c4bd#diff-7a83aac91ab02c7b354a2f40496c5d38ca1c88d3f72a391e83b8e265eb341454R69); see the sketch below.
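
Roughly, the workaround looks like this. It is a sketch, not the exact code from that commit: the field name and the scheduler interval are assumptions, as is the premise that fabric8's WatcherException can be subclassed so isHttpGone() returns true and the reconnect branch runs:

```java
import io.fabric8.kubernetes.client.WatcherException;
import io.javaoperatorsdk.operator.processing.event.internal.CustomResourceEventSource;
import io.quarkus.scheduler.Scheduled;
import java.lang.reflect.Field;

public class WatchReviver {

  private final CustomResourceEventSource eventSource;

  public WatchReviver(Object eventSourceHolder) throws ReflectiveOperationException {
    // The field name here is an assumption; it depends on the SDK version in use.
    Field f = eventSourceHolder.getClass().getDeclaredField("customResourceEventSource");
    f.setAccessible(true);
    this.eventSource = (CustomResourceEventSource) f.get(eventSourceHolder);
  }

  @Scheduled(every = "10m")
  void forceReconnect() {
    // Drive the same code path as a real watch disconnect, so the event
    // source re-registers its watch and resumes delivering events.
    eventSource.onClose(new WatcherException("scheduled forced reconnect") {
      @Override
      public boolean isHttpGone() {
        return true; // force the reconnect branch
      }
    });
  }
}
```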

Clearly this is not an ideal solution; I'm looking for better workarounds, and for what the best fix at the SDK level could be.

@secondsun (Contributor, Author) commented

After some thought, I have come up with two "easy" solutions that would work for my use case.

  1. If registerWatch in the onClose(WatcherException) method throws an exception, call System.exit(1). This will restart the pod, which will reestablish all connections.
  2. Alternatively, retry registerWatch with an exponential backoff (see the sketch after this list).
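
Here's a sketch of option 2. The retry loop is the point here, not the exact names; registerWatch is stubbed as a Runnable because the real SDK-internal call has a different signature:

```java
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

class WatchReconnector {

  private static final Logger log = LoggerFactory.getLogger(WatchReconnector.class);

  // Retry the (hypothetical) watch re-registration with exponential backoff
  // instead of letting a single failure kill the watch permanently.
  void reconnectWithBackoff(Runnable registerWatch) {
    long delayMs = 1_000;           // start at 1 second
    final long maxDelayMs = 60_000; // cap the delay at 60 seconds
    while (true) {
      try {
        registerWatch.run();
        return; // watch re-established
      } catch (RuntimeException e) {
        log.warn("Re-registering watch failed, retrying in {} ms", delayMs, e);
        try {
          Thread.sleep(delayMs);
        } catch (InterruptedException ie) {
          Thread.currentThread().interrupt();
          return; // stop retrying if we are being shut down
        }
        delayMs = Math.min(delayMs * 2, maxDelayMs);
      }
    }
  }
}
```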

Harder solutions would include firing events when a resource's watch dies, so the controllers for those resources can be alerted, or exposing the state of the event sources to the application in some way; a hypothetical shape for this is sketched below.
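
Purely to illustrate the shape such an API could take (none of these types exist in the SDK today; the names are invented):

```java
// Hypothetical listener interface; names are invented for illustration.
public interface EventSourceHealthListener {

  // Called when a watch terminates and the automatic re-register fails,
  // so the controller can alert, self-heal, or surface the condition.
  void onWatchDead(String resourceClassName, Throwable cause);
}
```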

@wtrocki commented Apr 16, 2021

@secondsun I've done some reading on why connection issues might happen:

  1. the cluster is unhealthy
  2. the cluster is overloaded
  3. there are networking issues between nodes

The reason we're seeing this is that our OpenStack QA clusters can sometimes be flaky.

If we keep restarting the pod on an overloaded cluster we might contribute to the problem; however, in that case Kubernetes would do the hard work for us via crash-loop backoff.

Approach no. 2, with our own backoff, could be problematic: there would be no way to monitor it properly, and a dropped connection could be a one-time freak accident that leaves some CRs unprocessed while others keep working. Checking the operator logs would give us this info, of course.

Knowing the pros and cons, I think we should go with approach no. 1, as the other one would require better monitoring.
