Proper resource cleanup for expired connections #1445

Open
BenAgai opened this issue Apr 2, 2023 · 6 comments

@BenAgai commented Apr 2, 2023

Hi,
I wanted to ask a question regarding the cleanup of expired connections made from an NSC to an NSE.

Assume an NSC has successfully established a connection to an NSE.
If the pod the NSC runs on is terminated without the NSC initiating the close flow, the data-path resources for that connection will leak.

Is there any built-in mechanism to recover from such a scenario? (For example, an expiration on the connection that is renewed every time a refresh occurs, so that once the connection expires a close flow is initiated for it.)

I came across the following links that look relevant, but I wasn't sure:
networkservicemesh/deployments-k8s#8882
#1439
#1440

I would be glad to know whether this issue is resolved and how, or whether I should implement such a mechanism myself.
Thanks!

@glazychev-art (Contributor)

Hi @BenAgai,
That's right, we have such functionality.
There is a timeout chain element that will call Close for inactive connections. This happens after the token expires (10 minutes by default).
If the connection is alive, refreshes occur periodically and keep it from expiring.
If the NSC didn't initiate Close when its pod ended, Close will be called once the timeout fires.
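
To illustrate the behavior (this is only a sketch, not the actual sdk code — the `timeout` element name comes from the sdk, but the types below are simplified stand-ins): each Request or refresh re-arms a per-connection timer for the token lifetime, and if the timer fires before the next refresh, Close is invoked.

```go
// Illustrative sketch of the expire-on-timeout behavior described above.
// Hypothetical types; the real chain element lives in the NSM sdk.
package main

import (
	"fmt"
	"sync"
	"time"
)

type expireManager struct {
	mu     sync.Mutex
	timers map[string]*time.Timer // connection ID -> expiry timer
}

func newExpireManager() *expireManager {
	return &expireManager{timers: make(map[string]*time.Timer)}
}

// OnRequest runs on the initial Request and on every refresh:
// it re-arms the timer for the full token lifetime.
func (m *expireManager) OnRequest(connID string, tokenLifetime time.Duration, closeFn func()) {
	m.mu.Lock()
	defer m.mu.Unlock()
	if t, ok := m.timers[connID]; ok {
		t.Stop() // a refresh arrived in time, so the old deadline is discarded
	}
	m.timers[connID] = time.AfterFunc(tokenLifetime, func() {
		m.mu.Lock()
		delete(m.timers, connID)
		m.mu.Unlock()
		closeFn() // token expired with no refresh: clean up the connection
	})
}

// OnClose runs when the client closes the connection itself.
func (m *expireManager) OnClose(connID string) {
	m.mu.Lock()
	defer m.mu.Unlock()
	if t, ok := m.timers[connID]; ok {
		t.Stop()
		delete(m.timers, connID)
	}
}

func main() {
	m := newExpireManager()
	m.OnRequest("conn-1", 100*time.Millisecond, func() {
		fmt.Println("timeout fired: closing conn-1")
	})
	time.Sleep(200 * time.Millisecond) // no refresh arrives, so Close fires
}
```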

@BenAgai (Author) commented Jun 8, 2023

Hi @glazychev-art,
Thank you very much for the update!

@or-adar commented Jul 16, 2023

@glazychev-art I'm reusing this thread since I have some questions about the timeout chain element that are unclear to me:
If I don't want to use the default 10 minutes and want a lower value, like 1 minute, what should be considered?
I mean, if the timeout element only considers the expiration time of its previous path segment, should I align the NSM_MAX_TOKEN_LIFETIME env var for all the components in the path (NSC, NSMgr, forwarders, passthrough, NSEs) so the new value is the same for all,
or are there particular components for which setting the value would be enough (while keeping the default for the others)?

@glazychev-art (Contributor)

@or-adar
You are right: currently the timeout element only considers its previous path segment.
If the timeout calls Close, this call goes through all components.

So, for example, if you set NSM_MAX_TOKEN_LIFETIME=1m only for the NSC, then the NSMgr calls Close after 1m (if refreshes do not take place regularly). The other components (forwarders, passthrough, NSEs) will also receive this Close from the NSMgr, but not from their own timeout elements (because they still use NSM_MAX_TOKEN_LIFETIME=10m).

For the normal case this is sufficient, but some questions may arise in the case of healing. For example, suppose the NSC and its NSMgr die. In that case the other components will clean up their resources only on their own timeouts (i.e. after 10m), because the only component (the NSMgr) that was responsible for clearing resources after 1 minute has died.
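
A small sketch of that timeline (the durations are the ones from this discussion; the component list is illustrative): each hop's timeout watches the token signed by its previous neighbor, so once the NSC and NSMgr die, the remaining hops only time out on the 10m tokens they last saw.

```go
// Illustrative timeline for the healing case above: the NSC is overridden
// to NSM_MAX_TOKEN_LIFETIME=1m while everything else keeps the 10m default.
package main

import (
	"fmt"
	"time"
)

func main() {
	// Token lifetime each component uses when signing its own path segment.
	lifetimes := map[string]time.Duration{
		"NSC":       1 * time.Minute, // the only component overridden to 1m
		"NSMgr":     10 * time.Minute,
		"forwarder": 10 * time.Minute,
		"NSE":       10 * time.Minute,
	}
	// Each hop's timeout element watches the *previous* segment's token:
	// the NSMgr watches the NSC's token, the forwarder watches the NSMgr's,
	// and the NSE watches the forwarder's.
	path := []string{"NSC", "NSMgr", "forwarder", "NSE"}
	for i := 1; i < len(path); i++ {
		prev := path[i-1]
		fmt.Printf("%s times out %v after the last refresh (token signed by %s)\n",
			path[i], lifetimes[prev], prev)
	}
}
```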

@or-adar commented Jul 17, 2023

@glazychev-art appreciate the response!
So, if I got you right: if I want NSEs and forwarders to clean up their resources after 1m, it is recommended to set NSM_MAX_TOKEN_LIFETIME to 1m for all components that might be included in the path (NSC, NSMgr, forwarders, passthrough, NSEs), to accommodate cases where components die?

@glazychev-art (Contributor)

Yes, that's right.
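
For reference, the variable is a plain duration string read from each component's environment. A minimal sketch of the idea (the parsing code below is hypothetical — the real NSM cmds have their own config loading, so only the variable name and the 10m default are taken from this thread):

```go
// Hypothetical sketch of reading the token lifetime from the environment.
package main

import (
	"fmt"
	"os"
	"time"
)

func main() {
	v := os.Getenv("NSM_MAX_TOKEN_LIFETIME") // e.g. "1m", set on every component
	if v == "" {
		v = "10m" // the default lifetime mentioned above
	}
	lifetime, err := time.ParseDuration(v)
	if err != nil {
		fmt.Fprintln(os.Stderr, "bad NSM_MAX_TOKEN_LIFETIME:", err)
		os.Exit(1)
	}
	fmt.Println("token lifetime:", lifetime)
}
```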
