fix(watcher): reconnect after server or client timeout #780

buehler · 2024-06-20T09:21:54Z

This fixes #739.
This closes #771.

Allows the resource watcher to retry the connection until the
cancellation token requests a stop. The watcher caches the
received entities and checks for their keys in a concurrent
dictionary. Recurring "added" events after reconnection
should not trigger a reconciliation.
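
The behaviour described above can be sketched roughly as follows. This is an illustrative Python model, not the watcher's actual implementation: `watch_with_retry`, `open_watch`, `reconcile`, the `(kind, key)` event shape, and the use of an `asyncio.Event` in place of the cancellation token are all assumptions made for the example, and a plain dict stands in for the concurrent dictionary.

```python
import asyncio
from typing import AsyncIterator, Callable, Dict, Tuple

# Hypothetical event shape: (event type, entity key such as a UID).
Event = Tuple[str, str]


async def watch_with_retry(
    open_watch: Callable[[], AsyncIterator[Event]],
    reconcile: Callable[[Event], None],
    stop: asyncio.Event,
    retry_delay: float = 0.0,
) -> None:
    """Keep a watch alive until `stop` is set, skipping replayed "added" events."""
    # Stand-in for the concurrent dictionary of entity keys (assumption).
    seen: Dict[str, bool] = {}
    while not stop.is_set():
        try:
            async for kind, key in open_watch():
                if kind == "added" and key in seen:
                    continue  # replayed after a reconnect: already reconciled
                if kind == "deleted":
                    seen.pop(key, None)
                else:
                    seen[key] = True
                reconcile((kind, key))
        except (TimeoutError, ConnectionError):
            pass  # server or client timeout: fall through and reconnect
        await asyncio.sleep(retry_delay)
```

Checking only the key is the simplest reading of the description; a real cache might also compare a version or generation so that changes made while disconnected are not skipped.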

buehler · 2024-06-20T09:28:54Z

@duke-bartholomew what do you think about these changes? This should fix #739. Further it has a breaking change, but one that is pretty open. The change should not affect many people.

To deduplicate the events that are generated while reconnecting, the entity cache in the watcher should do the trick.

So, theoretically the breaking change would not be required.

buehler · 2024-06-20T09:31:46Z

I think I'm going to remove the breaking change, because right now there is no real value in it.

duke-bartholomew · 2024-07-05T09:42:40Z

> @duke-bartholomew what do you think about these changes? This should fix #739. Further it has a breaking change, but one that is pretty open. The change should not affect many people.
>
> To deduplicate the events that are generated while reconnecting, the entity cache in the watcher should do the trick.
>
> So, theoretically the breaking change would not be required.

@buehler sorry for my late reply.
👍 looks good

I would personally also love to have the ability to get a ResourceVersion from the List actions ...
This would give me the ability to first List resources, build up my local state for all of them, then trigger any side-effects on the total state and then start watches to process changes in "real-time" and trigger further side-effects as they happen.
To do this I currently use the underlying IKubernetes client directly, but this bypasses your KubernetesClient abstraction and re-implements a big part of what is already present in your KubernetesClient, which feels like a bit of a shame.
But anyway this is not related anymore to the original issue of the Watcher dying.
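
For illustration, the list-then-watch flow described above could look roughly like this. It relies on the Kubernetes API behaviour that a list response carries a `resourceVersion` and that a watch can be started from that version. Everything else is hypothetical: `ListWatchClient`, `list_then_watch`, the callbacks, and the entity shape are made up for the sketch and are not the KubernetesClient API.

```python
from typing import Callable, Dict, Iterable, List, Protocol, Tuple

Entity = Dict[str, str]      # hypothetical shape: {"uid": ..., "spec": ...}
Change = Tuple[str, Entity]  # ("added" | "modified" | "deleted", entity)


class ListWatchClient(Protocol):
    """Hypothetical client surface, assumed for this example only."""

    def list(self) -> Tuple[List[Entity], str]:
        """Return the items plus the resourceVersion of the list response."""
        ...

    def watch(self, resource_version: str) -> Iterable[Change]:
        """Yield changes that happened after `resource_version`."""
        ...


def list_then_watch(
    client: ListWatchClient,
    on_initial_state: Callable[[Dict[str, Entity]], None],
    on_change: Callable[[str, Entity], None],
) -> Dict[str, Entity]:
    # 1. List everything and build the local state.
    items, resource_version = client.list()
    state = {item["uid"]: item for item in items}

    # 2. Trigger side effects once, on the total state.
    on_initial_state(dict(state))

    # 3. Watch from the listed version so no change in between is missed.
    for kind, entity in client.watch(resource_version):
        if kind == "deleted":
            state.pop(entity["uid"], None)
        else:
            state[entity["uid"]] = entity
        on_change(kind, entity)
    return state
```

Starting the watch at the version returned by the list is what closes the gap between the two calls: nothing that changes after the snapshot is lost, and nothing from before it is replayed.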

buehler added 2 commits June 20, 2024 11:22

fix(watcher): reconnect after server or client timeout

c0e679d

fix the tests

441f994

buehler force-pushed the fix/resource-watcher-fails-to-retry branch from 4d3db82 to 441f994 Compare June 20, 2024 09:25

remove the breaking change

7521da3

buehler marked this pull request as ready for review June 20, 2024 10:41

buehler merged commit aa073d1 into main Jun 20, 2024
3 checks passed

buehler deleted the fix/resource-watcher-fails-to-retry branch June 20, 2024 11:05
