-
Notifications
You must be signed in to change notification settings - Fork 36
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[sdk#994] Heal deadline #995
[sdk#994] Heal deadline #995
Conversation
Signed-off-by: Vladimir Popov <vladimir.popov@xored.com>
Signed-off-by: Vladimir Popov <vladimir.popov@xored.com>
Signed-off-by: Vladimir Popov <vladimir.popov@xored.com>
aac442a
to
d125386
Compare
Signed-off-by: Vladimir Popov <vladimir.popov@xored.com>
438765f
to
ab3248e
Compare
Signed-off-by: Vladimir Popov <vladimir.popov@xored.com>
ab3248e
to
6ae8df2
Compare
@@ -108,8 +108,6 @@ func (l *interposeServer) Request(ctx context.Context, request *networkservice.N | |||
return result, nil | |||
} | |||
|
|||
l.activeConnection.Delete(activeConnID) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need to delete this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When heal fails its first Request, it cleans up active connection - it results in Close failing with no active connection found
because for interpose
it looks like trying to close not existing connection. heal
stands in chain after interpose
, so client cannot stop healing in such case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm... As I can see with this fix activeConnection map will not delete connections.
Am I get it correctly that we got a leak of active connections?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right, missed this.
Fixed in other way with setting clientURL = nil
an invoking next.Close
in Close in case of no active connection exists.
nseCtxCancel() | ||
time.Sleep(100 * time.Millisecond) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need to add sleep here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test validates that heal can be stopped after it has been started - so we need to wait for some time to make sure that heal is really running at the moment when we are trying to stop it.
Without this sleep test is also working, but it tests preventive heal stop - and so it is not the thing that should be tested here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there some other way we can make sure that heal has started?
Recently we found out that on CI 100ms sometimes isn't enough for thread synchronization. And the test where we had issues with timeout is much simpler, so complex heal tests likely require even more time to be sure that another thread started working.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be problematic.
Waiting for 100ms
doesn't guarantee that heal has been started, but it does mean that in most test runs it will be so.
Signed-off-by: Vladimir Popov <vladimir.popov@xored.com>
Description
Issue link
Closes #994.
How Has This Been Tested?
Added unit testing to coverCovered by existing unit testingTypes of changes