Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

xds: Fail xDS Server Serve() if called after Stop() or GracefulStop() #6410

Merged
merged 3 commits into from
Jun 27, 2023

Conversation

zasweq
Copy link
Contributor

@zasweq zasweq commented Jun 23, 2023

Fixes #5838.

This PR fails Serve() if called after Stop() or GracefulStop(). This is important because Serve() eventually increments the global singleton xDS Client Ref, and thus if called without a corresponding cleanup, Stop() or GracefulStop(), the singleton xDS Client will eventually leak.

RELEASE NOTES: N/A

@zasweq zasweq requested a review from easwars June 23, 2023 23:58
@zasweq zasweq added this to the 1.57 Release milestone Jun 23, 2023
// Do this in a goroutine, even though should just receive error. This is to
// prevent blocking for (which comes after, so doesn't apply here), but
// mainly to wrap this err Receive operation with a testing timeout.
go func() { serveDone.Send(server.Serve(lis)) }()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see any value in doing this in a goroutine and plumbing the returned error value through a channel for the main test goroutine to inspect. I think all we need here is a direct call to server.Serve() and inspecting the error returned from the client.

If you want to test Stop racing with Serve, you should probably do something like this:

func (s) TestXxx(t *testing.T) {
  // Setup a real management server
  // Setup bootstrap file to point to the management server

  for i := 0 ; i < 100; i++ {
    // Start an xDS enabled server.
    // Spawn a goroutine which calls Serve
    // Spawn a goroutine which calls Stop/GracefulStop
  }

  // Wait for all the above goroutines to complete using a sync.WaitGroup

  // Rely on the leakchecker to detect if any goroutines are leaked (which will be the case if not all references to the real xdsClient being created are not released).
}

Copy link
Contributor Author

@zasweq zasweq Jun 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I had this same thought. I had option 1 but wanted to wrap the operation with a timeout, since it could possibly hang if it doesn't hit the error and does the while true, and then we have to wait for the test to hit 7 minute timeout. I think the simplicity of just doing err := server.Serve beats the reason you need to wrap in a timeout. Hmmmmm ok I never thought about testing with the leak checker. Oh I didn't want to call Serve before Close in a test because I didn't want to send an LDS response so it can call into server.Server in grpc, but thinking about it you don't need a good LDS response, because it will receive from this quit here to exit: https://github.com/grpc/grpc-go/blob/master/xds/server.go#L270, and will already have refed the client conn up 1, which is the part that is interesting to test. Will switch to err := server.Serve and implement your test.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack.

xds/server_test.go Show resolved Hide resolved
xds/server.go Show resolved Hide resolved
@easwars easwars assigned zasweq and unassigned easwars Jun 27, 2023
@zasweq zasweq assigned easwars and unassigned zasweq Jun 27, 2023
}

// TestServeAndCloseDoNotRace tests that Serve and Close on the xDS Server do
// not race and leak the xDS Client. A race would be found by the leak checker.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: the leak would be found by the leak checker and not the race :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hahahaha fair. Switched.

@easwars easwars assigned zasweq and unassigned easwars Jun 27, 2023
@zasweq zasweq merged commit 575a936 into grpc:master Jun 27, 2023
1 check passed
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Dec 25, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

xds: server should fail Serve() if called after Close()
2 participants