Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

xds/server: Fix nil panic in xDS Server when received LDS with no inline RDS #6747

Closed
wants to merge 5 commits into from

Conversation

zasweq
Copy link
Contributor

@zasweq zasweq commented Oct 24, 2023

This PR fixes a nil panic in Listener Wrapper, from a nil receiver described in issue #6683.

RELEASE NOTES:

  • xds/server: Fix nil panic in xDS Server when received LDS with no inline RDS

@zasweq zasweq requested a review from dfawley October 24, 2023 23:43
@zasweq zasweq added this to the 1.60 Release milestone Oct 24, 2023
@codecov
Copy link

codecov bot commented Oct 24, 2023

Codecov Report

Merging #6747 (16d7f91) into master (6e14274) will increase coverage by 0.19%.
Report is 6 commits behind head on master.
The diff coverage is 63.63%.

Additional details and impacted files

Copy link
Member

@dfawley dfawley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code changes LGTM except for the stray comment, but I think you said we'll need to patch this on the release branch, but it's different from main.

More importantly: for this PR, can you write a test that reproduces this, confirms this fixes it, and helps us write the fix on the release branch?

Thanks!

@@ -194,7 +194,7 @@ func (s) TestListenerWrapper_InlineRouteConfig(t *testing.T) {
// resource. The test verifies that the listenerWrapper does not become ready
// when waiting for the Route Configuration resource and becomes ready once it
// receives the Route Configuration resource.
func (s) TestListenerWrapper_RouteNames(t *testing.T) {
func (s) TestListenerWrapper_RouteNames(t *testing.T) { // This is what I need, LDS + RDS and then accept a conn that looks up
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops, done.

@dfawley dfawley assigned zasweq and unassigned dfawley Oct 25, 2023
@zasweq
Copy link
Contributor Author

zasweq commented Oct 31, 2023

Added a test which triggers nil panic on master, and the fix fixes it. Will work on release branches now.

@zasweq
Copy link
Contributor Author

zasweq commented Oct 31, 2023

Added a test which triggers it on master and verifies fix.

@ginayeh ginayeh assigned dfawley and unassigned zasweq Oct 31, 2023
t.Fatal(err)
}
serving := grpcsync.NewEvent()
modeChangeOpt := xds.ServingModeCallback(func(addr net.Addr, args xds.ServingModeChangeArgs) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this necessary? If so, why? Once the listener is created, I think it should be safe to call Dial?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a comment as mentioned.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a comment.

@zasweq zasweq changed the title xds/internal/server: Fix nil panic in Listener Wrapper xds/server: Fix nil panic in xDS Server when received LDS with no inline RDS Oct 31, 2023
t.Fatal(err)
}
serving := grpcsync.NewEvent()
modeChangeOpt := xds.ServingModeCallback(func(addr net.Addr, args xds.ServingModeChangeArgs) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a comment as mentioned.

Copy link
Member

@dfawley dfawley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments as we discussed offline

@@ -331,15 +331,19 @@ func (l *listenerWrapper) handleRDSUpdate(update rdsHandlerUpdate) {
}
if update.err != nil {
if xdsresource.ErrType(update.err) == xdsresource.ErrorTypeResourceNotFound {
l.switchMode(nil, connectivity.ServingModeNotServing, update.err)
l.mu.Lock()
l.filterChains = nil
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is wrong; we should not throw away the LDS data when we get a bad RDS. A good RDS needs to be able to work again.

l.mu.Lock()
l.filterChains = nil
l.switchModeLocked(connectivity.ServingModeNotServing, update.err)
l.mu.Unlock()
}
// For errors which are anything other than "resource-not-found", we
// continue to use the old configuration.
return
}
atomic.StorePointer(&l.rdsUpdates, unsafe.Pointer(&update.updates))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to verify somewhere that we are still watching this RDS update. Probably at the top?

l.switchMode(nil, connectivity.ServingModeNotServing, fmt.Errorf("address (%s:%s) in Listener update does not match listening address: (%s:%s)", ilc.Address, ilc.Port, l.addr, l.port))
l.mu.Lock()
l.filterChains = nil
l.switchModeLocked(connectivity.ServingModeNotServing, fmt.Errorf("address (%s:%s) in Listener update does not match listening address: (%s:%s)", ilc.Address, ilc.Port, l.addr, l.port))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also needs to stop the RDS watch if there was one, such that subsequent RDS updates are ignored.

@dfawley
Copy link
Member

dfawley commented Nov 3, 2023

I'll just close this since I think there's even more to it than we were originally thinking.

@dfawley dfawley closed this Nov 3, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators May 2, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants