xds/server: Fix nil panic in xDS Server when received LDS with no inline RDS #6747
Code changes LGTM except for the stray comment. I think you said we'll need to patch this on the release branch as well, where the code differs from main.
More importantly: for this PR, can you write a test that reproduces this, confirms this fixes it, and helps us write the fix on the release branch?
Thanks!
@@ -194,7 +194,7 @@ func (s) TestListenerWrapper_InlineRouteConfig(t *testing.T) {
 // resource. The test verifies that the listenerWrapper does not become ready
 // when waiting for the Route Configuration resource and becomes ready once it
 // receives the Route Configuration resource.
-func (s) TestListenerWrapper_RouteNames(t *testing.T) {
+func (s) TestListenerWrapper_RouteNames(t *testing.T) { // This is what I need, LDS + RDS and then accept a conn that looks up
Remove
Whoops, done.
Added a test which triggers nil panic on master, and the fix fixes it. Will work on release branches now.
Added a test which triggers it on master and verifies fix.
	t.Fatal(err)
}
serving := grpcsync.NewEvent()
modeChangeOpt := xds.ServingModeCallback(func(addr net.Addr, args xds.ServingModeChangeArgs) {
Is this necessary? If so, why? Once the listener is created, I think it should be safe to call Dial?
Please add a comment as mentioned.
Added a comment.
Some comments as we discussed offline
@@ -331,15 +331,19 @@ func (l *listenerWrapper) handleRDSUpdate(update rdsHandlerUpdate) {
 	}
 	if update.err != nil {
 		if xdsresource.ErrType(update.err) == xdsresource.ErrorTypeResourceNotFound {
-			l.switchMode(nil, connectivity.ServingModeNotServing, update.err)
+			l.mu.Lock()
+			l.filterChains = nil
This is wrong; we should not throw away the LDS data when we get a bad RDS. A good RDS needs to be able to work again.
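To make this point concrete, here is a minimal sketch of the behavior being asked for: an RDS "resource not found" result moves the listener to not-serving but leaves the LDS-derived data in place, so a later good RDS update can restore serving. The `listener` type and its fields are hypothetical simplifications, not the real `listenerWrapper`, and clearing `routes` on not-found is an assumption of this sketch.

```go
package main

import "fmt"

// listener is a hypothetical, simplified stand-in for the listener wrapper.
type listener struct {
	filterChains []string          // data from the last good LDS update
	routes       map[string]string // data from the last good RDS update
	serving      bool
}

// handleRDS applies an RDS result. On a not-found error it stops serving but
// keeps filterChains, so a later good RDS update can restore serving.
func (l *listener) handleRDS(routes map[string]string, notFound bool) {
	if notFound {
		l.routes = nil // assumption: the removed route data is dropped
		l.serving = false
		return
	}
	l.routes = routes
	l.serving = l.filterChains != nil
}

func main() {
	l := &listener{filterChains: []string{"default"}}

	l.handleRDS(nil, true) // bad RDS: not serving, LDS data retained
	fmt.Println(l.serving, len(l.filterChains))

	l.handleRDS(map[string]string{"routeA": "cfg"}, false) // good RDS again
	fmt.Println(l.serving, len(l.filterChains))
}
```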
			l.mu.Lock()
			l.filterChains = nil
			l.switchModeLocked(connectivity.ServingModeNotServing, update.err)
			l.mu.Unlock()
		}
		// For errors which are anything other than "resource-not-found", we
		// continue to use the old configuration.
		return
	}
	atomic.StorePointer(&l.rdsUpdates, unsafe.Pointer(&update.updates))
We need to verify somewhere that we are still watching this RDS update. Probably at the top?
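One way to read this suggestion is a guard at the top of the handler that drops any update for a route name the listener no longer watches. The sketch below assumes each update carries its route name; `rdsUpdate`, `listener`, and `watched` are hypothetical names for illustration and do not mirror the real `rdsHandlerUpdate` type.

```go
package main

import "fmt"

// rdsUpdate is a hypothetical update tagged with the route name it is for.
type rdsUpdate struct {
	routeName string
	config    string
}

// listener is a hypothetical stand-in. watched holds the route names required
// by the current LDS resource.
type listener struct {
	watched map[string]bool
	routes  map[string]string
}

// handleRDSUpdate ignores updates for route names that are no longer watched
// and reports whether the update was applied.
func (l *listener) handleRDSUpdate(u rdsUpdate) bool {
	if !l.watched[u.routeName] {
		return false // stale update from a watch that is no longer wanted
	}
	l.routes[u.routeName] = u.config
	return true
}

func main() {
	l := &listener{
		watched: map[string]bool{"routeA": true},
		routes:  map[string]string{},
	}
	fmt.Println(l.handleRDSUpdate(rdsUpdate{"routeA", "v1"})) // watched: applied
	fmt.Println(l.handleRDSUpdate(rdsUpdate{"routeB", "v1"})) // unwatched: ignored
}
```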
-		l.switchMode(nil, connectivity.ServingModeNotServing, fmt.Errorf("address (%s:%s) in Listener update does not match listening address: (%s:%s)", ilc.Address, ilc.Port, l.addr, l.port))
+		l.mu.Lock()
+		l.filterChains = nil
+		l.switchModeLocked(connectivity.ServingModeNotServing, fmt.Errorf("address (%s:%s) in Listener update does not match listening address: (%s:%s)", ilc.Address, ilc.Port, l.addr, l.port))
This also needs to stop the RDS watch if there was one, such that subsequent RDS updates are ignored.
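A rough sketch of what stopping the watch could look like: when a bad LDS update arrives, the listener cancels its current RDS watch, and any update delivered later through that watch is rejected. The `watch` and `listener` types are hypothetical, the code is single-threaded for brevity (real code would need the listener's mutex), and this is not the approach the PR took.

```go
package main

import "fmt"

// watch is a hypothetical handle for one RDS watch.
type watch struct{ cancelled bool }

// listener is a hypothetical stand-in for the listener wrapper.
type listener struct {
	rds     *watch // current RDS watch, nil when none is active
	serving bool
}

// startWatch registers a new RDS watch and returns its handle.
func (l *listener) startWatch() *watch {
	w := &watch{}
	l.rds = w
	return w
}

// handleBadLDS stops serving and cancels any active RDS watch.
func (l *listener) handleBadLDS() {
	if l.rds != nil {
		l.rds.cancelled = true
		l.rds = nil
	}
	l.serving = false
}

// handleRDS applies an update only if it comes from the current, live watch.
func (l *listener) handleRDS(w *watch) bool {
	if w.cancelled || w != l.rds {
		return false
	}
	l.serving = true
	return true
}

func main() {
	l := &listener{}
	w := l.startWatch()
	fmt.Println(l.handleRDS(w)) // live watch: update applied

	l.handleBadLDS()
	fmt.Println(l.handleRDS(w), l.serving) // cancelled watch: update ignored
}
```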
I'll just close this since I think there's even more to it than we were originally thinking.
This PR fixes a nil panic in Listener Wrapper, from a nil receiver described in issue #6683.
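As general Go background for the failure class named here (not a reproduction of the actual panic site, which this thread does not show): calling a method on a nil pointer receiver is legal, and the panic only happens once the method dereferences the receiver. `routeTable`, `lookup`, `safeLookup`, and `panics` are hypothetical names used for illustration.

```go
package main

import "fmt"

// routeTable is a hypothetical type used only to illustrate nil receivers.
type routeTable struct{ routes map[string]string }

// lookup dereferences the receiver, so it panics when rt is nil.
func (rt *routeTable) lookup(name string) string { return rt.routes[name] }

// safeLookup guards against a nil receiver before touching its fields.
func (rt *routeTable) safeLookup(name string) (string, bool) {
	if rt == nil {
		return "", false
	}
	v, ok := rt.routes[name]
	return v, ok
}

// panics reports whether calling f panics.
func panics(f func()) (p bool) {
	defer func() {
		if recover() != nil {
			p = true
		}
	}()
	f()
	return false
}

func main() {
	var rt *routeTable // nil, e.g. no route data has been stored yet

	fmt.Println(panics(func() { rt.lookup("routeA") })) // true: nil dereference

	_, ok := rt.safeLookup("routeA")
	fmt.Println(ok) // false: the guarded method returns cleanly
}
```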
RELEASE NOTES: