pickfirst: Implement Happy Eyeballs #7725
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #7725 +/- ##
==========================================
- Coverage 82.00% 81.93% -0.08%
==========================================
Files 373 374 +1
Lines 37735 37872 +137
==========================================
+ Hits 30945 31030 +85
- Misses 5512 5556 +44
- Partials 1278 1286 +8
Should we mention the environment variables in the release note? Or at least in the PR description?
I couldn't complete a full pass, but here are some comments to get started.
Updated the release notes.
LGTM. Some minor nits in the tests.
// Replace the timer channel so that the old timers don't attempt to read
// messages pushed next.
Old timers should get canceled when subsequent subchannels are created, right? Why do we need to do this?
This is required since pickfirst will stop the timer, but the fake TimeAfterFunc will still keep waiting on the timer channel till the context is cancelled. If there are multiple listeners on the timer channel, they will race to read from the channel.
This could be avoided by introducing an interface for a time.Timer so that the test can intercept calls to Timer.Stop().
I see what you are saying. That seems better to me, unless it is too much work.
Refactored internal.TimeAfterFunc to return a cancelFunc() instead of a timer. This lets the test stop the fake timer when pickfirst cancels it. I also added a helper that returns a timer function together with a function to trigger the timer manually, so the tests no longer write to a channel.
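For readers following along, here is a minimal sketch of what such a helper could look like. The names timeAfterFunc, manualTimer and newManualTimer are hypothetical and are not taken from the PR; the only thing assumed from the comment above is the shape "schedule a callback, get back a cancel function".

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// timeAfterFunc schedules f and returns a function that cancels the pending
// call. The type name is hypothetical; only the shape follows the comment.
type timeAfterFunc func(d time.Duration, f func()) (cancel func())

// manualTimer is a hypothetical test fake. It never looks at the clock; the
// test decides when the most recently scheduled callback runs.
type manualTimer struct {
	mu      sync.Mutex
	gen     int    // identifies the most recently armed timer
	pending func() // nil when no timer is armed
}

// newManualTimer returns the fake timer function and a trigger function.
// trigger reports whether a callback was pending and therefore ran.
func newManualTimer() (timeAfterFunc, func() bool) {
	mt := &manualTimer{}
	after := func(_ time.Duration, f func()) func() {
		mt.mu.Lock()
		defer mt.mu.Unlock()
		mt.gen++
		id := mt.gen
		mt.pending = f
		return func() {
			mt.mu.Lock()
			defer mt.mu.Unlock()
			// Only disarm if no newer timer has replaced this one.
			if mt.gen == id {
				mt.pending = nil
			}
		}
	}
	trigger := func() bool {
		mt.mu.Lock()
		f := mt.pending
		mt.pending = nil
		mt.mu.Unlock()
		if f == nil {
			return false
		}
		f()
		return true
	}
	return after, trigger
}

func main() {
	after, trigger := newManualTimer()
	fired := 0

	cancel := after(250*time.Millisecond, func() { fired++ })
	cancel() // the code under test stops its timer
	fmt.Println(trigger(), fired) // false 0

	after(250*time.Millisecond, func() { fired++ })
	fmt.Println(trigger(), fired) // true 1
}
```

Because cancelling removes the pending callback, a stale timer can no longer race with a newer one for the test's trigger.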
testutils.AwaitNotState(shortCtx, t, cc, connectivity.TransientFailure)

// Third SubConn fails.
shortCancel()
Do we need this? Won't testutils.AwaitNotState fail the test if the specified state is reached before the context expires?
It's not required because of the way testutils.AwaitNotState works. When I tried to ignore the first cancel function as follows:
shortCtx, _ := context.WithTimeout(ctx, defaultTestShortTimeout)
govet complains about a possible context leak because it can't verify at compile time that the context will be cancelled. If we re-assign the cancel func later, govet doesn't complain, but I still called cancel just to be consistent. Removed the call now.
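As a small illustration of the vet complaint mentioned above (the lostcancel check), the sketch below keeps the CancelFunc and defers it. The function names deadlineHit and demo are made up for this example and do not appear in the PR.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// deadlineHit waits on a short-lived child context and reports whether it
// ended because its deadline passed.
func deadlineHit(parent context.Context, d time.Duration) bool {
	// Discarding the CancelFunc (shortCtx, _ := ...) is what vet flags.
	// Keeping it and deferring the call releases the context's resources
	// on every return path.
	shortCtx, cancel := context.WithTimeout(parent, d)
	defer cancel()

	<-shortCtx.Done()
	return errors.Is(shortCtx.Err(), context.DeadlineExceeded)
}

// demo exists only so the example can be exercised without arguments.
func demo() bool {
	return deadlineHit(context.Background(), 10*time.Millisecond)
}

func main() {
	fmt.Println(demo()) // true
}
```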
// The happy eyeballs timer expires, skipping server[1] and requesting the creation
// of a third SubConn.
Why do you say we are skipping server[1] here? IIUC:
- we first started a connection to server[0]
- connection to server[0] failed before the HE timer fired
- so, we started a connection to server[1]
- now, the HE timer has fired
- so, we would start a connection to server[2]
I don't see where we are skipping server[1].
The test doesn't skip the server, but it skips waiting for the SubConn to report a success or failure and moves on to the next SubConn. The comment was copied from Java's test case. I've improved the wording now.
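The sequence discussed in this thread can be replayed with a toy model. This is only a sketch: the event type and the replay function are hypothetical, and the rule that only a failure of the most recently started attempt advances the list is an assumption of the sketch, not a statement about the PR's code.

```go
package main

import "fmt"

// event is either a timer expiry or a failure report for one address.
type event struct {
	timer  bool // true: the connection delay timer fired
	failed int  // otherwise: index of the address whose attempt failed
}

// replay returns the order in which attempts were started and the attempts
// that are still pending (started, no failure reported) after all events.
func replay(numAddrs int, events []event) (started, pending []int) {
	failed := make(map[int]bool)
	start := func() {
		if len(started) < numAddrs {
			started = append(started, len(started))
		}
	}

	start() // the first address is attempted immediately
	for _, ev := range events {
		if ev.timer {
			// Stop waiting for the latest attempt, but leave it running.
			start()
			continue
		}
		failed[ev.failed] = true
		// Assumption: only a failure of the latest attempt moves on.
		if ev.failed == len(started)-1 {
			start()
		}
	}
	for _, i := range started {
		if !failed[i] {
			pending = append(pending, i)
		}
	}
	return started, pending
}

func main() {
	// server[0] fails before the timer fires, then the timer fires.
	started, pending := replay(3, []event{{failed: 0}, {timer: true}})
	fmt.Println(started, pending) // [0 1 2] [1 2]
}
```

In the output, server[1] is still pending when server[2] is started, which is the "skips waiting" behaviour described above.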
// The SubConn is being re-used and failed during a previous pass
// over the addressList. It has not completed backoff yet.
// Mark it as having failed and try the next address.
scd.connectionFailed = true
lastErr = scd.lastErr
continue
case connectivity.Ready:
Optional: merge these into a default and then just log the state and say it was unexpected and return.
Merged.
@@ -580,9 +643,17 @@ func (b *pickfirstBalancer) updateSubConnState(sd *scData, newState balancer.Sub
 }
 }

-func (b *pickfirstBalancer) endFirstPassLocked(lastErr error) {
+func (b *pickfirstBalancer) endFirstPassIfPossibleLocked(lastErr error) {
+if b.addressList.isValid() || b.subConns.Len() < b.addressList.size() {
isValid feels a lot like firstPass. And shouldn't b.subConns.Len() always be equal to the addressList's index, so if addressList.isValid then the second case should always be true, too? I'm sensing some duplication here that might be able to be removed & simplified. Or, I'm not understanding something and we need some comments to explain why these checks are here.
> isValid feels a lot like firstPass.
For the first pass to be completed, all the SubConns need to report a TF. !isValid() will indicate that SubConn.Connect() has been called on all the SubConns, but it doesn't tell us if TFs have been reported. So the isValid() check is used as an optimization to return early and avoid iterating on the entire SubConns map.
> And shouldn't b.subConns.Len() always be equal to the addressList's index, so if addressList.isValid then the second case should always be true, too?
Yes, the check for b.subConns.Len() < b.addressList.size() seems redundant because if !b.addressList.isValid(), then a SubConn would have been created for every address while iterating over the list. Removed the check for b.subConns.Len() < b.addressList.size().
Added some comments explaining the check and what the function does.
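A simplified model of the check described in this reply might look as follows. The firstPassTracker type and its fields are hypothetical stand-ins; addressesLeft plays the role that the reply assigns to addressList.isValid().

```go
package main

import "fmt"

// subConnData is a stand-in for the PR's scData with only the field needed.
type subConnData struct {
	connectionFailed bool // set once the SubConn reports TF in this pass
}

// firstPassTracker is a hypothetical, simplified model of the balancer state.
type firstPassTracker struct {
	firstPass bool
	addrs     []string
	next      int // index of the next address to connect to
	subConns  map[string]*subConnData
}

// addressesLeft is true while some address has not been attempted yet.
func (t *firstPassTracker) addressesLeft() bool {
	return t.next < len(t.addrs)
}

// endFirstPassIfPossible ends the first pass only when every address has
// been attempted and every SubConn has reported a failure.
func (t *firstPassTracker) endFirstPassIfPossible() bool {
	if !t.firstPass || t.addressesLeft() {
		return false // cheap early exit, no need to walk the map
	}
	for _, sc := range t.subConns {
		if !sc.connectionFailed {
			return false // at least one attempt is still in progress
		}
	}
	t.firstPass = false
	return true
}

func main() {
	t := &firstPassTracker{
		firstPass: true,
		addrs:     []string{"a", "b"},
		next:      2, // Connect() has been called for both addresses
		subConns: map[string]*subConnData{
			"a": {connectionFailed: true},
			"b": {},
		},
	}
	fmt.Println(t.endFirstPassIfPossible()) // false
	t.subConns["b"].connectionFailed = true
	fmt.Println(t.endFirstPassIfPossible()) // true
}
```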
// The SubConn is being re-used and failed during a previous pass
// over the addressList. It has not completed backoff yet.
// Mark it as having failed and try the next address.
scd.connectionFailed = true
connectionFailed is a bit like lastErr != nil. Do we need both?
lastErr is used to update the picker at the end of the first pass. If the last address in the list hasn't completed its backoff from a previous attempt, scd.lastErr still holds a non-nil error. This is why scd.lastErr is not reset when starting the first pass over a new address list.
scd.connectionFailed indicates whether the subchannel has failed with the latest address list from the resolver. It is reset before starting the first pass.
Consider a subchannel that is being re-used after a resolver update because its address is present in the new address list. The subchannel has already failed, so it has scd.lastErr set and scd.connectionFailed set to true. When the first pass starts, scd.connectionFailed is set to false.
- If the subchannel has completed backoff when the iteration over the address list reaches it, the subchannel will be connected since its state is IDLE. When it fails again, scd.connectionFailed will be set to true and scd.lastErr will be updated.
- If the subchannel is in backoff when the iteration over the address list reaches it, the subchannel will not be re-tried. scd.lastErr will be retained and scd.connectionFailed will be set to true.
The above steps ensure that the subchannel always has a non-nil error to update the picker.
// Wait for the SubConn to report success or failure.
// Wait for the connection attempt to complete or the timer to fire
// before attempting the next address.
b.scheduleNextConnectionLocked()
It feels like either this or the one in the Idle case is redundant.
A subchannel can be CONNECTING from a previous resolver update. We can still call Connect on it, but that would unnecessarily log the debug statement "connect called on addrConn in non-idle state (%v); ignoring.".
Both of these cases are mentioned in A61.
b.firstPass = false
b.numTF = 0
b.state = connectivity.TransientFailure

b.cc.UpdateState(balancer.State{
Are we really supposed to attempt re-connection on every already-idle subchannel at the same instant? And not apply the timer or anything?
Yes, this is mentioned in A61:
If the first pass completes without a successful connection attempt, we will switch to a mode where we keep trying to connect to all addresses at all times, with no regard for the order of the addresses.
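A rough sketch of the mode described in the quoted A61 text: once the first pass ends, every subchannel that is already idle is reconnected at once, with no timer and no ordering. The types and the endFirstPass function here are illustrative only and are not the PR's implementation.

```go
package main

import "fmt"

type connState int

const (
	idle connState = iota
	connecting
	transientFailure
)

// subConn is a hypothetical stand-in for a real SubConn.
type subConn struct {
	addr  string
	state connState
}

// connect stands in for SubConn.Connect().
func (sc *subConn) connect() { sc.state = connecting }

// endFirstPass reconnects every idle subchannel immediately. Subchannels
// still in backoff are left alone until they become idle again.
func endFirstPass(scs []*subConn) (reconnected []string) {
	for _, sc := range scs {
		if sc.state == idle {
			sc.connect()
			reconnected = append(reconnected, sc.addr)
		}
	}
	return reconnected
}

func main() {
	scs := []*subConn{
		{addr: "a", state: idle},
		{addr: "b", state: transientFailure},
		{addr: "c", state: idle},
	}
	fmt.Println(endFirstPass(scs)) // [a c]
}
```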
internal/envconfig/envconfig.go
Outdated
@@ -43,8 +43,7 @@ var (
 // EnforceALPNEnabled is set if TLS connections to servers with ALPN disabled
 // should be rejected. The HTTP/2 protocol requires ALPN to be enabled, this
 // option is present for backward compatibility. This option may be overridden
-// by setting the environment variable "GRPC_ENFORCE_ALPN_ENABLED" to "true"
-// or "false".
+// by setting the environment variable "GRPC_ENFORCE_ALPN_ENABLED" to "false".
I suspect you didn't want these diffs.
Reverted. I initially added an env var to toggle happy eyeballs, which was later removed. I had made a similar change to the doc comment for the new env var based on reviewer feedback.
As part of the Dualstack design, the pickfirst policy should implement the happy eyeballs algorithm while connecting to multiple backends.
The timeout for the happy eyeballs connection timer is NOT configurable as that's an optional requirement in the gRFC.
RELEASE NOTES:
pickfirst LB policy (disabled by default) supports Happy Eyeballs to attempt connections to multiple backends concurrently. The experimental pickfirst policy can be enabled by setting the environment variable GRPC_EXPERIMENTAL_ENABLE_NEW_PICK_FIRST to true.