Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(watcher): reconnect after server or client timeout #780

Merged
merged 3 commits into from
Jun 20, 2024

Conversation

buehler
Copy link
Owner

@buehler buehler commented Jun 20, 2024

This fixes #739.
This closes #771.

Allows the resource watcher to retry the connection until the
cancellation token requests a stop. The watcher caches the
received entities and checks for their keys in a concurrent
dictionary. Recurring "added" events after reconnection
should not trigger a reconciliation.

@buehler buehler force-pushed the fix/resource-watcher-fails-to-retry branch from 4d3db82 to 441f994 Compare June 20, 2024 09:25
@buehler
Copy link
Owner Author

buehler commented Jun 20, 2024

@duke-bartholomew what do you think about these changes? This should fix #739. Further it has a breaking change, but one that is pretty open. The change should not affect many people.

To deduplicate the events that are generated while reconnecting, the entity cache in the watcher should do the trick.

So, theoretically the breaking change would not be required.

@buehler
Copy link
Owner Author

buehler commented Jun 20, 2024

I think I'm going to remove the breaking change, because right now there is no real value in it.

@buehler buehler marked this pull request as ready for review June 20, 2024 10:41
@buehler buehler merged commit aa073d1 into main Jun 20, 2024
3 checks passed
@buehler buehler deleted the fix/resource-watcher-fails-to-retry branch June 20, 2024 11:05
@duke-bartholomew
Copy link

@duke-bartholomew what do you think about these changes? This should fix #739. Further it has a breaking change, but one that is pretty open. The change should not affect many people.

To deduplicate the events that are generated while reconnecting, the entity cache in the watcher should do the trick.

So, theoretically the breaking change would not be required.

@buehler sorry for my late reply ..
👍 looks good

I would personally also love to have the ability to get a ResourceVersion from the List actions ...
This would give me the ability to first List resources, build up my local state for all of them, then trigger any side-effects on the total state and then start watches to process changes in "real-time" and trigger further side-effects as they happen.
To do this I currently use the underlying IKubernetes client directly, but this is then bypassing your KubernetesClient abstraction and re-implementing a big part of what you already have present in your KubernetsClient, which feels a bit like a shame.
But anyway this is not related anymore to the original issue of the Watcher dying.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[bug]: No retry is performed when the ResourceWatcher fail to watch a resource
2 participants