-
Notifications
You must be signed in to change notification settings - Fork 4.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
grpc: support channel idleness #6263
Conversation
I will be adding e2e tests soon and will push a commit for the same. But the PR is ready to looked at. |
Looks like there is a race around closing and entering/exiting idle. I hit it with my first e2e test. Will ping the PR when this is ready to be looked at again. Sorry. |
41500ec
to
3649ed2
Compare
8ee9c19
to
7d21f4f
Compare
7d21f4f
to
a3771e2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think I'll be able to finish a pass today, but here are my thoughts on the idle manager so far...
idle.go
Outdated
) | ||
|
||
// For overriding in unit tests. | ||
var newTimer = func(d time.Duration, f func()) *time.Timer { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: afterFunc
or timeAfterFunc
please
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
idle.go
Outdated
|
||
enforcer idlenessEnforcer // Functionality provided by grpc.ClientConn. | ||
timeout int64 // Idle timeout duration nanos stored as an int64. | ||
isDisabled bool // Disabled if idle_timeout is set to 0. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar, but how about returning a nil
*idlenessManager
instead and make the receivers short-circuit via nil
? Or don't call from cc if nil
. Or make this implement an interface and have a disabledIdlenessManager type that is returned instead. It seems unusual to have a field that indicates the whole struct shouldn't even be used, given that it doesn't change dynamically.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Used the short-circuit via nil
option.
idle.go
Outdated
i.timer = newTimer(time.Duration(i.timeout), i.handleIdleTimeout) | ||
i.isIdle = false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These things seem like they'd belong in exitIdleMode
, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After the switch to the atomic idleness manager, this is not applicable anymore.
0e5b271
to
6c1a3ea
Compare
balancer_conn_wrappers.go
Outdated
// Reset the current balancer name so that we act on the next call to | ||
// switchTo by creating a new balancer specified by the new resolver. | ||
ccb.curBalancerName = "" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be better to do this when entering idle instead of exiting?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
clientconn.go
Outdated
if cc.dopts.idleTimeout == 0 { | ||
cc.idlenessMgr = newDisabledIdlenessManager() | ||
} else { | ||
cc.idlenessMgr = newAtomicIdlenessManager(cc, cc.dopts.idleTimeout) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we simplify to cc.idlenessMgr = newIdlenessManager(cc, cc.dopts.idleTimeout)
and leave the complexity of the implementation up to the implementation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
call.go
Outdated
} | ||
return invoke(ctx, method, args, reply, cc, opts...) | ||
cc.idlenessMgr.onCallEnd() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Optional: defer
instead and leave the same?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
test/idleness_test.go
Outdated
sCtx, sCancel := context.WithTimeout(ctx, 3*defaultTestShortIdleTimeout) | ||
defer sCancel() | ||
go func() { | ||
for ; sCtx.Err() == nil; <-time.After(defaultTestShortTimeout) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe defaultTestShortIdleTimeout/2
for this instead? Otherwise these things that are otherwise not dependent on each other are actually dependent on each other for this test's correctness.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Thanks.
test/idleness_test.go
Outdated
// restarted when exiting idle, it will push the same address to grpc again. | ||
r := manual.NewBuilderWithScheme("whatever") | ||
backend := stubserver.StartTestService(t, nil) | ||
t.Cleanup(func() { backend.Stop() }) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Optional nit: t.Cleanup(backend.Stop)
It seems most of the other cleanups can't be simplified similarly (cc.Close
errors and things that take parameters have to b e wrapped). 😢
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice. Thanks. I've gotten so used to using t.Cleanup(func() { ... })
that I didn't even stop to think about the signature of this method.
test/idleness_test.go
Outdated
sCtx, sCancel := context.WithTimeout(ctx, 3*defaultTestShortIdleTimeout) | ||
defer sCancel() | ||
if cc.WaitForStateChange(sCtx, connectivity.Ready) { | ||
t.Fatalf("Connectivity state changed to %q when expected to stay in READY", cc.GetState()) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use these instead!
func awaitState(ctx context.Context, t *testing.T, cc *grpc.ClientConn, stateWant connectivity.State)
func awaitNotState(ctx context.Context, t *testing.T, cc *grpc.ClientConn, stateDoNotWant connectivity.State)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or factor out similarly, next to these (in clientconn_state_transition_test.go
), since this particular test doesn't seem to be able to reuse those.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. And also, added a awaitNoStateChange
alongside the existing ones. At some point, we can move them to the testutils
package.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the review!!
call.go
Outdated
} | ||
return invoke(ctx, method, args, reply, cc, opts...) | ||
cc.idlenessMgr.onCallEnd() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
clientconn.go
Outdated
if cc.dopts.idleTimeout == 0 { | ||
cc.idlenessMgr = newDisabledIdlenessManager() | ||
} else { | ||
cc.idlenessMgr = newAtomicIdlenessManager(cc, cc.dopts.idleTimeout) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
test/idleness_test.go
Outdated
sCtx, sCancel := context.WithTimeout(ctx, 3*defaultTestShortIdleTimeout) | ||
defer sCancel() | ||
go func() { | ||
for ; sCtx.Err() == nil; <-time.After(defaultTestShortTimeout) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Thanks.
test/idleness_test.go
Outdated
// restarted when exiting idle, it will push the same address to grpc again. | ||
r := manual.NewBuilderWithScheme("whatever") | ||
backend := stubserver.StartTestService(t, nil) | ||
t.Cleanup(func() { backend.Stop() }) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice. Thanks. I've gotten so used to using t.Cleanup(func() { ... })
that I didn't even stop to think about the signature of this method.
test/idleness_test.go
Outdated
sCtx, sCancel := context.WithTimeout(ctx, 3*defaultTestShortIdleTimeout) | ||
defer sCancel() | ||
if cc.WaitForStateChange(sCtx, connectivity.Ready) { | ||
t.Fatalf("Connectivity state changed to %q when expected to stay in READY", cc.GetState()) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. And also, added a awaitNoStateChange
alongside the existing ones. At some point, we can move them to the testutils
package.
Pulled the final set of changes into g3 and the tests seem healthy. Will merge on Monday. |
This PR adds support for channel idleness. Summary of changes:
idlenessManager
thatidlenessManager
idlenessManager
fromInvoke()
andNewStream()
DialContext
a little bit for better code flowaddTraceEvent
helper to emit channelz trace eventsWithIdleTimeout
dial option to setidle_timeout
30m
if unset (current default used by Java)0
RELEASE NOTES:
WithIdleTimeout
dial option