Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pickfirst: Implement Happy Eyeballs #7725

Open
wants to merge 25 commits into
base: master
Choose a base branch
from

Conversation

arjan-bal
Copy link
Contributor

@arjan-bal arjan-bal commented Oct 10, 2024

As part of the Dualstack design, the pickfirst policy should implement the happy eyeballs algorithm while connecting to multiple backends.

The timeout for the happy eyeballs connection timer is NOT configurable as that's an optional requirement in the gRFC.

RELEASE NOTES:

  • The new experimental pickfirst LB policy (disabled by default) supports Happy Eyeballs to attempt connections to multiple backends concurrently. The experimental pickfirst policy can be enabled by setting the environment variable GRPC_EXPERIMENTAL_ENABLE_NEW_PICK_FIRST to true.

@arjan-bal arjan-bal added the Type: Feature New features or improvements in behavior label Oct 10, 2024
@arjan-bal arjan-bal added this to the 1.68 Release milestone Oct 10, 2024
Copy link

codecov bot commented Oct 10, 2024

Codecov Report

Attention: Patch coverage is 86.81319% with 12 lines in your changes missing coverage. Please review.

Project coverage is 81.93%. Comparing base (18d218d) to head (11fe515).
Report is 4 commits behind head on master.

Files with missing lines Patch % Lines
balancer/pickfirst/pickfirstleaf/pickfirstleaf.go 86.36% 9 Missing and 3 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #7725      +/-   ##
==========================================
- Coverage   82.00%   81.93%   -0.08%     
==========================================
  Files         373      374       +1     
  Lines       37735    37872     +137     
==========================================
+ Hits        30945    31030      +85     
- Misses       5512     5556      +44     
- Partials     1278     1286       +8     
Files with missing lines Coverage Δ
balancer/pickfirst/internal/internal.go 100.00% <100.00%> (ø)
balancer/pickfirst/pickfirstleaf/pickfirstleaf.go 89.29% <86.36%> (+0.51%) ⬆️

... and 29 files with indirect coverage changes

@easwars
Copy link
Contributor

easwars commented Oct 10, 2024

Should we mention the environment variables in the release note? Or at least in the PR description?

Copy link
Contributor

@easwars easwars left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I couldn't complete a full pass, but some comment here to get satrted.

balancer/pickfirst/pickfirstleaf/pickfirstleaf.go Outdated Show resolved Hide resolved
internal/envconfig/envconfig.go Outdated Show resolved Hide resolved
balancer/pickfirst/pickfirstleaf/pickfirstleaf.go Outdated Show resolved Hide resolved
balancer/pickfirst/pickfirstleaf/pickfirstleaf.go Outdated Show resolved Hide resolved
balancer/pickfirst/pickfirstleaf/pickfirstleaf.go Outdated Show resolved Hide resolved
balancer/pickfirst/pickfirstleaf/pickfirstleaf.go Outdated Show resolved Hide resolved
balancer/pickfirst/pickfirstleaf/pickfirstleaf.go Outdated Show resolved Hide resolved
balancer/pickfirst/pickfirstleaf/pickfirstleaf.go Outdated Show resolved Hide resolved
balancer/pickfirst/pickfirstleaf/pickfirstleaf.go Outdated Show resolved Hide resolved
@easwars easwars assigned arjan-bal and unassigned easwars Oct 10, 2024
@arjan-bal
Copy link
Contributor Author

Should we mention the environment variables in the release note? Or at least in the PR description?

Updated the release notes.

@arjan-bal arjan-bal assigned easwars and unassigned arjan-bal Oct 11, 2024
@purnesh42H purnesh42H modified the milestones: 1.68 Release, 1.69 Release Oct 16, 2024
@easwars easwars assigned arjan-bal and unassigned easwars Oct 17, 2024
@arjan-bal arjan-bal assigned easwars and dfawley and unassigned arjan-bal Oct 18, 2024
Copy link
Contributor

@easwars easwars left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Some minor nits in the tests.

balancer/pickfirst/pickfirstleaf/pickfirstleaf_ext_test.go Outdated Show resolved Hide resolved
balancer/pickfirst/pickfirstleaf/pickfirstleaf_ext_test.go Outdated Show resolved Hide resolved
balancer/pickfirst/pickfirstleaf/pickfirstleaf_ext_test.go Outdated Show resolved Hide resolved
balancer/pickfirst/pickfirstleaf/pickfirstleaf_ext_test.go Outdated Show resolved Hide resolved
balancer/pickfirst/pickfirstleaf/pickfirstleaf_ext_test.go Outdated Show resolved Hide resolved
balancer/pickfirst/pickfirstleaf/pickfirstleaf_ext_test.go Outdated Show resolved Hide resolved
Comment on lines 1060 to 1061
// Replace the timer channel so that the old timers don't attempt to read
// messages pushed next.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Old timers should get canceled when subsequent subchannels are created, right? Why do we need to do this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is required since pickfirst will stop the timer, but the fake TimeAfterFunc will still keep waiting on the timer channel till the context is cancelled. If there are multiple listeners on the timer channel, they will race to read from the channel.

This could be avoided by introducing an interface for a time.Timer so that the test can intercept calls to Timer.Stop().

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see what you are saying. That seems better to me, unless it is too much work.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refactored to have the internal.TimeAfterFunc return a cancelFunc() instead of a timer. This allowed the test to stop the timer when pickfirst cancels the timer. I also created a helper function to return a timer function and a function to trigger the timer manually instead of having the tests write on channel.

@easwars easwars assigned arjan-bal and unassigned easwars Oct 22, 2024
@arjan-bal arjan-bal removed their assignment Oct 23, 2024
@arjan-bal arjan-bal removed their assignment Oct 23, 2024
testutils.AwaitNotState(shortCtx, t, cc, connectivity.TransientFailure)

// Third SubConn fails.
shortCancel()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this? Won't testutils.AwaitNotState fail the test if the specified state is reached before the context expires?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not required because of the way testutils.AwaitNotState works. When I tried to ignore the first cancel function as follows:

shortCtx, _ := context.WithTimeout(ctx, defaultTestShortTimeout)

govet complains about a possible context leak because it can't ensure that the context will be cancelled at compile time. If we re-assign the cancel func later, govet doesn't complain but I still called cancel just to be consistent. Removed the call now.

Comment on lines 1024 to 1025
// The happy eyeballs timer expires, skipping server[1] and requesting the creation
// of a third SubConn.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you say we are skipping server[1] here? IIUC correctly:

  • we first started a connection to server[0]
  • connection to server[0] failed before the HE timer fired
  • so, we started a connection to server[1]
  • now, the HE timer has fired
  • so, we would start a connection to server[2]

I don't see where we are skipping server[1].

Copy link
Contributor Author

@arjan-bal arjan-bal Oct 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test doesn't skip the server but it skips waiting for the SubConn to report a success or failure and moves on to the next SubConn. The comment was copied taken from Java's test case. I've improved the wording now.

@arjan-bal arjan-bal added the Area: Resolvers/Balancers Includes LB policy & NR APIs, resolver/balancer/picker wrappers, LB policy impls and utilities. label Nov 7, 2024
// The SubConn is being re-used and failed during a previous pass
// over the addressList. It has not completed backoff yet.
// Mark it as having failed and try the next address.
scd.connectionFailed = true
lastErr = scd.lastErr
continue
case connectivity.Ready:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optional: merge these into a default and then just log the state and say it was unexpected and return.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Merged.

@@ -580,9 +643,17 @@ func (b *pickfirstBalancer) updateSubConnState(sd *scData, newState balancer.Sub
}
}

func (b *pickfirstBalancer) endFirstPassLocked(lastErr error) {
func (b *pickfirstBalancer) endFirstPassIfPossibleLocked(lastErr error) {
if b.addressList.isValid() || b.subConns.Len() < b.addressList.size() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isValid feels a lot like firstPass. And shouldn't b.subConns.Len() always be equal to the addressList's index, so if addressList.isValid then the second case should always be true, too? I'm sensing some duplication here that might be able to be removed & simplified. Or, I'm not understanding something and we need some comments to explain why these checks are here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isValid feels a lot like firstPass.

For the first pass to be completed, all the SubConns need to report a TF. !isValid() will indicate that SubConn.Connect() has been called on all the SubConns, but it doesn't tell us if TFs have been reported. So the isValid() check is used as an optimization to return early and avoid iterating on the entire SubConns map.

And shouldn't b.subConns.Len() always be equal to the addressList's index, so if addressList.isValid then the second case should always be true, too?

Yes, the check for b.subConns.Len() < b.addressList.size() seems redundant because if !b.addressList.isValid(), then a SubConn would have been created for every address while iterating over the list. Removed the check for b.subConns.Len() < b.addressList.size().

Added some comments explaining the check and what the function does.

// The SubConn is being re-used and failed during a previous pass
// over the addressList. It has not completed backoff yet.
// Mark it as having failed and try the next address.
scd.connectionFailed = true
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

connectionFailed is a bit like lastErr != nil. Do we need both?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lastErr is used to update the picker at the end of the first pass. In the case where the last address in the list hasn't completed it's backoff from a previous attempt, scd.lastErr would store a non-nil error. This is why scd.lastErr is not reset when starting the first pass over a new address list.

scd.connectionFailed indicates if the subchannel has failed with the latest address list from the resolver. It is reset before staring the first pass.

Consider a subchannel is being re-used after getting a resolver update because it's address is present in the new address list. The subchannel has already failed, it has scd.lastErr set and scd.connectionFailed set to true. When the first pass starts, scd.connectionFailed is set to false.

  • If the subchannel completes backoff when the iteration over the address list reaches it, the subchannel will be connected since it's state is IDLE. When it fails again, scd.connectionFailed will be set to true and scd.lastErr will be updated.
  • If the subchannel is in backoff when the iteration over the address list reaches it, the subchannel will not be re-tried. scd.lastErr will be retained and scd.connectionFailed will be set to true.

The above steps ensure that the subchannel always has a non-nil error to update the picker.

// Wait for the SubConn to report success or failure.
// Wait for the connection attempt to complete or the timer to fire
// before attempting the next address.
b.scheduleNextConnectionLocked()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It feels like either this or the one in the Idle case is redundant.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A subchannel can be CONNECTING from a previous resolver update. We can call still Connect on it, but it will unnecessarily log a debug statement connect called on addrConn in non-idle state (%v); ignoring..

Both these cases are mentioned in A61.
image

b.firstPass = false
b.numTF = 0
b.state = connectivity.TransientFailure

b.cc.UpdateState(balancer.State{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we really supposed to attempt re-connection on every already-idle subchannel at the same instant? And not apply the timer or anything?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is mentioned in A61:

If the first pass completes without a successful connection attempt, we will switch to a mode where we keep trying to connect to all addresses at all times, with no regard for the order of the addresses.

@@ -43,8 +43,7 @@ var (
// EnforceALPNEnabled is set if TLS connections to servers with ALPN disabled
// should be rejected. The HTTP/2 protocol requires ALPN to be enabled, this
// option is present for backward compatibility. This option may be overridden
// by setting the environment variable "GRPC_ENFORCE_ALPN_ENABLED" to "true"
// or "false".
// by setting the environment variable "GRPC_ENFORCE_ALPN_ENABLED" to "false".
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect you didn't want these diffs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reverted. I initially added an env var to toggle happy eyeballs, which was later removed it. I made a similar change to the doc comment for new env var based on reviewer feedback.

@dfawley dfawley assigned arjan-bal and unassigned dfawley Nov 7, 2024
@arjan-bal arjan-bal assigned dfawley and unassigned arjan-bal Nov 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: Resolvers/Balancers Includes LB policy & NR APIs, resolver/balancer/picker wrappers, LB policy impls and utilities. Type: Feature New features or improvements in behavior
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants