xds: Fix flaky test Test/ServerSideXDS_WithValidAndInvalidSecurityConfiguration #7411
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@ Coverage Diff @@
## master #7411 +/- ##
==========================================
+ Coverage 81.42% 81.48% +0.06%
==========================================
Files 348 350 +2
Lines 26744 26846 +102
==========================================
+ Hits 21775 21875 +100
- Misses 3779 3783 +4
+ Partials 1190 1188 -2
It looks like the ADS RPC handler in go-control-plane essentially processes one request (which includes ACKs and NACKs as well) at a time, serially. And it calls the …
In the logs of failing runs for TestServerSideXDS_WithValidAndInvalidSecurityConfiguration, we see that the resource snapshot update request is sent to the xDS management server before the xDS client is able to connect to it. Every time the gRPC server receives the updated snapshot resource from the management server, it sends a NACK because the configuration is invalid. When the management server receives more than one NACK, it gets stuck while writing to a buffered channel here: grpc-go/test/xds/xds_server_certificate_providers_test.go (line 249 in d27ddb5).
This results in the gRPC server timing out while getting the new configuration.
This change ensures that the server writes to the buffered channel at most once and doesn't get stuck.
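For illustration, here is a minimal sketch of one way to guarantee a single write to a buffered channel, assuming a channel of capacity 1 and a `sync.Once` guard. The `nackNotifier` type and its methods are hypothetical names made up for this example; they are not taken from the test file and may not match the actual change in this PR.

```go
package main

import (
	"fmt"
	"sync"
)

// nackNotifier is a hypothetical helper (not from grpc-go) that signals
// the first NACK on a buffered channel and ignores all later ones.
type nackNotifier struct {
	once sync.Once
	ch   chan struct{}
}

func newNackNotifier() *nackNotifier {
	// Capacity 1 means the single guarded send below never blocks,
	// even if no one is receiving yet.
	return &nackNotifier{ch: make(chan struct{}, 1)}
}

// notify may be called for every NACK the handler sees. Only the first
// call writes to the channel, so a serial request handler cannot get
// stuck on a full buffer.
func (n *nackNotifier) notify() {
	n.once.Do(func() { n.ch <- struct{}{} })
}

func main() {
	n := newNackNotifier()
	for i := 0; i < 3; i++ {
		n.notify() // simulate repeated NACKs arriving one after another
	}
	<-n.ch // the test side observes the single signal
	fmt.Println("buffered signals left:", len(n.ch))
}
```

A non-blocking send (`select` with a `default` case) would also avoid the hang, but it can write again once the channel has been drained; the `sync.Once` guard matches the "at most once" wording more directly.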
Example failure: https://github.com/grpc/grpc-go/actions/runs/9790522733/job/27032314481?pr=7390
Tested
Verified that the test no longer fails in 100000 runs
Updates: #6914
RELEASE NOTES: None