test: fix flaky test ClientSendsAGoAway #7224
3753ade to 63ba062
This probably needs more description of the fix.
test/goaway_test.go
@@ -816,6 +818,7 @@ func (s) TestClientSendsAGoAway(t *testing.T) {
}
}
}()
cc.WaitForStateChange(ctx, connectivity.Connecting)
What would be the maximum wait here? The timeout for all tests is 7m. Is there a chance it can breach that?
The maximum wait is bounded by the context: it waits only until the context times out.
Shouldn't the state to wait for be connectivity.Ready?
I think it depends where the race is happening.
If there's just a race where we call Close before we even start creating the connection, then this is a problem with the test, and Connecting should fix that.
If this is a race where a connection is established and doesn't get a GOAWAY written to it (due to internal state in the ClientConn/addrConn), then Connecting will expose that race and we should fix it.
@purnesh42H WaitForStateChange waits until the connectivity.State of the ClientConn changes from sourceState or ctx expires. It returns true in the former case and false in the latter. Since we pass Connecting as sourceState, it waits until the state changes from Connecting.
I think the race occurs when the ClientConn is closed before we try to send the GOAWAY, so I believe just waiting for the ClientConn state change will fix this race condition.
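For readers following this thread, here is a minimal sketch of the WaitForStateChange semantics discussed above and of a loop that waits for a specific target state rather than for a single transition. It uses only the standard library: State, stateWatcher, fakeConn, awaitState, and demo are hypothetical stand-ins written for illustration, not grpc-go code. In the real test the two methods would presumably be cc.GetState and cc.WaitForStateChange on the ClientConn.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// State is a hypothetical stand-in for connectivity.State.
type State int

const (
	Idle State = iota
	Connecting
	Ready
)

// stateWatcher mirrors the two ClientConn methods discussed in the thread.
type stateWatcher interface {
	GetState() State
	WaitForStateChange(ctx context.Context, source State) bool
}

// awaitState blocks until w reports want, or returns an error once ctx expires.
func awaitState(ctx context.Context, w stateWatcher, want State) error {
	for s := w.GetState(); s != want; s = w.GetState() {
		if !w.WaitForStateChange(ctx, s) {
			return fmt.Errorf("still in state %d, want %d: %w", s, want, ctx.Err())
		}
	}
	return nil
}

// fakeConn is an illustrative fake channel, not a real ClientConn.
type fakeConn struct {
	mu      sync.Mutex
	state   State
	changed chan struct{} // closed and replaced on every state change
}

func newFakeConn() *fakeConn { return &fakeConn{changed: make(chan struct{})} }

func (f *fakeConn) GetState() State {
	f.mu.Lock()
	defer f.mu.Unlock()
	return f.state
}

func (f *fakeConn) set(s State) {
	f.mu.Lock()
	defer f.mu.Unlock()
	if s == f.state {
		return
	}
	f.state = s
	close(f.changed)
	f.changed = make(chan struct{})
}

// WaitForStateChange returns true as soon as the state differs from source,
// and false if ctx expires first.
func (f *fakeConn) WaitForStateChange(ctx context.Context, source State) bool {
	f.mu.Lock()
	if f.state != source {
		f.mu.Unlock()
		return true
	}
	ch := f.changed
	f.mu.Unlock()
	select {
	case <-ch:
		return true
	case <-ctx.Done():
		return false
	}
}

// demo drives the fake through Connecting (and optionally Ready) while
// awaitState waits for Ready under a 200ms context timeout.
func demo(reachReady bool) error {
	ctx, cancel := context.WithTimeout(context.Background(), 200*time.Millisecond)
	defer cancel()
	conn := newFakeConn()
	go func() {
		conn.set(Connecting)
		if reachReady {
			conn.set(Ready)
		}
	}()
	return awaitState(ctx, conn, Ready)
}

func main() {
	fmt.Println("channel reaches Ready:", demo(true))
	fmt.Println("channel stuck in Connecting:", demo(false))
}
```

The loop re-reads the state after every wake-up, so a transition to any state other than the target simply triggers another wait, and the total wait is still bounded by the context deadline.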
test/goaway_test.go
@@ -800,6 +800,7 @@ func (s) TestClientSendsAGoAway(t *testing.T) {
for {
f, err := ct.fr.ReadFrame()
if err != nil {
t.Logf("error reading frame: %v", err)
This needs to push to errCh.
Updated
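As an illustration of the suggestion in this thread, the sketch below has the reader goroutine report a read failure on errCh instead of only logging it, so the waiting side can fail promptly. It is an assumption-laden sketch, not the PR's code: frameReader, waitForGoAway, errConnClosed, and demo are hypothetical names, and frames are modeled as plain strings rather than HTTP/2 frames.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// frameReader is a hypothetical stand-in for ct.fr.ReadFrame in the test.
type frameReader func() (frame string, err error)

// errConnClosed is an illustrative error returned when no frames remain.
var errConnClosed = errors.New("connection closed")

// waitForGoAway reads frames in a goroutine until it sees "GOAWAY".
// Read errors are pushed to errCh so the caller observes them.
func waitForGoAway(ctx context.Context, read frameReader) error {
	errCh := make(chan error, 1) // buffered so the goroutine never blocks on send
	go func() {
		for {
			f, err := read()
			if err != nil {
				errCh <- fmt.Errorf("error reading frame: %w", err)
				return
			}
			if f == "GOAWAY" {
				errCh <- nil
				return
			}
		}
	}()
	select {
	case err := <-errCh:
		return err
	case <-ctx.Done():
		return fmt.Errorf("timed out waiting for GOAWAY: %w", ctx.Err())
	}
}

// demo feeds the given frames to waitForGoAway, then reports errConnClosed.
func demo(frames []string) error {
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()
	i := 0
	read := func() (string, error) {
		if i >= len(frames) {
			return "", errConnClosed
		}
		f := frames[i]
		i++
		return f, nil
	}
	return waitForGoAway(ctx, read)
}

func main() {
	fmt.Println("with GOAWAY:", demo([]string{"SETTINGS", "GOAWAY"}))
	fmt.Println("without GOAWAY:", demo([]string{"SETTINGS"}))
}
```

In a test, the caller would typically turn a non-nil result into t.Fatalf on the test goroutine, which avoids calling t methods from the reader goroutine after the test may have finished.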
From chat offline:
So I think the test is correct (though it can still be simplified), but there is a race in the code that prevents the GOAWAY from being sent in some cases.
@dfawley Updated the PR.
Update: this is the code where we hard-close the transport. I will update the PR with a change that sends a GOAWAY frame in this case as well.
@dfawley As per our offline discussion, we won't wait for all the transport connections to close down. So this PR is ready for review, as it just makes sure we wait for the ClientConn to be READY.
Fixes #7160
Fix: goaway_test/TestClientSendsAGoAway did not wait for the channel to be ready, which is why it was sometimes unable to read the frame, resulting in a test failure. This PR makes sure we read the frame only once the channel's state becomes READY.
RELEASE NOTES: n/a