xds: generic lrs client for load reporting #8250
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #8250 +/- ##
==========================================
+ Coverage 82.15% 82.28% +0.13%
==========================================
Files 412 419 +7
Lines 40562 41847 +1285
==========================================
+ Hits 33322 34433 +1111
- Misses 5875 5962 +87
- Partials 1365 1452 +87
Looks fine overall. Just a few comments inline.
-func (ls *LoadStore) ReporterForCluster(clusterName, serviceName string) PerClusterReporter {
-	panic("unimplemented")
+func (ls *LoadStore) ReporterForCluster(clusterName, serviceName string) *PerClusterReporter {
+	if ls == nil {
I think it's fine to panic if a nil `LoadStore` is used. Why not? It seems like a pretty severe programming error.
Removed nil check. @easwars any reason why this check is there in existing code?
Probably for tests to use a `nil` load store. If that is not required anymore and tests are happy, we should be good to remove the `nil` check.
Can we get rid of the nil check? Does any test fail?
No, no test fails. Removed. I had removed the others, but it looks like this one was left behind.
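For readers following the nil-check discussion, here is a minimal sketch of what happens once the check is gone. The type `perClusterReporter` and the helper `panicsOnNil` are hypothetical stand-ins, not the PR's code. In Go, calling a method on a nil pointer receiver is legal by itself; the panic only occurs when the method dereferences the receiver, which is the "severe programming error" behavior the reviewer prefers to surface.

```go
package main

import "fmt"

// perClusterReporter is a hypothetical, simplified stand-in for the
// reporter type discussed in the review.
type perClusterReporter struct {
	started int
}

// callStarted has no nil check. It dereferences p, so calling it on a
// nil receiver panics with a nil pointer dereference.
func (p *perClusterReporter) callStarted() {
	p.started++
}

// panicsOnNil reports whether calling callStarted on a nil receiver panics.
func panicsOnNil() (panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	var p *perClusterReporter // nil receiver
	p.callStarted()
	return false
}

func main() {
	fmt.Println("nil receiver panics:", panicsOnNil())
}
```

With a nil check in place the call would silently do nothing, which hides the misuse; without it, the bug shows up immediately at the call site.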
 }

 // CallStarted records a call started in the LoadStore.
 func (p *PerClusterReporter) CallStarted(locality string) {
-	panic("unimplemented")
+	if p == nil {
As above. And below.
Done
}

func (rcd *rpcCountData) decrInProgress() {
	atomic.AddUint64(rcd.inProgress, negativeOneUInt64) // atomic.Add(x, -1)
Nit: the const doesn't seem to buy us anything, since we're already needing to comment what this means. IMO delete the constant and inline it.
Made it inline
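As a sketch of what the inlined form might look like (the PR's final code is not shown in this thread, so this is an assumption): the `sync/atomic` documentation recommends `AddUint64(&x, ^uint64(0))` to decrement an unsigned counter. The type `rpcCount` and its constructor below are hypothetical simplifications of `rpcCountData`.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// rpcCount is a hypothetical, simplified stand-in for rpcCountData.
type rpcCount struct {
	inProgress *uint64
}

func newRPCCount() *rpcCount {
	return &rpcCount{inProgress: new(uint64)}
}

func (c *rpcCount) incrInProgress() {
	atomic.AddUint64(c.inProgress, 1)
}

// decrInProgress adds ^uint64(0) (all bits set). Unsigned addition wraps
// around, so this is equivalent to subtracting one.
func (c *rpcCount) decrInProgress() {
	atomic.AddUint64(c.inProgress, ^uint64(0))
}

func (c *rpcCount) load() uint64 {
	return atomic.LoadUint64(c.inProgress)
}

func main() {
	c := newRPCCount()
	c.incrInProgress()
	c.incrInProgress()
	c.decrInProgress()
	fmt.Println(c.load()) // prints 1
}
```

Inlining the expression keeps the decrement idiom next to the comment that explains it, which is the point of the nit above.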
s = rld.sum
rld.sum = 0
c = rld.count
rld.count = 0
You could do something like this, which might(?) be more quickly understood:
-s = rld.sum
-rld.sum = 0
-c = rld.count
-rld.count = 0
+s, rld.sum = rld.sum, 0
+c, rld.count = rld.count, 0
Or,
-s = rld.sum
-rld.sum = 0
-c = rld.count
-rld.count = 0
+s, c = rld.sum, rld.count
+rld.sum, rld.count = 0, 0
Did the first one
c = rld.count
rld.count = 0
rld.mu.Unlock()
return
Please don't use bare `return`s that return values. It can be hard to understand what's going on, mainly for longer functions.
-return
+return s, c
Or pair with the second option above:
func (rld *rpcLoadData) loadAndClear() (float64, int64) {
	rld.mu.Lock()
	defer rld.mu.Unlock()
	s, c := rld.sum, rld.count
	rld.sum, rld.count = 0, 0
	return s, c
}
Yeah, added them to the return statement.
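For context, a self-contained sketch of the read-and-reset pattern under discussion. The struct fields and the `add` helper are assumptions made for illustration; only `loadAndClear` mirrors the reviewer's suggestion, with explicit return values and a deferred unlock.

```go
package main

import (
	"fmt"
	"sync"
)

// rpcLoadData is a hypothetical, simplified version of the type in the PR.
type rpcLoadData struct {
	mu    sync.Mutex
	sum   float64
	count int64
}

// add records one load sample. This helper is an assumption for the demo.
func (rld *rpcLoadData) add(v float64) {
	rld.mu.Lock()
	defer rld.mu.Unlock()
	rld.sum += v
	rld.count++
}

// loadAndClear returns the accumulated sum and count and resets both,
// all under a single lock acquisition so no sample is lost or counted twice.
func (rld *rpcLoadData) loadAndClear() (float64, int64) {
	rld.mu.Lock()
	defer rld.mu.Unlock()
	s, c := rld.sum, rld.count
	rld.sum, rld.count = 0, 0
	return s, c
}

func main() {
	rld := &rpcLoadData{}
	rld.add(1.5)
	rld.add(2.5)
	s, c := rld.loadAndClear()
	fmt.Println(s, c) // prints 4 2
	s, c = rld.loadAndClear()
	fmt.Println(s, c) // prints 0 0
}
```

The explicit `return s, c` makes it obvious at the return site which values leave the function, which is the readability concern raised against the bare `return`.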
return c, err
}

/*
Why is this commented out?
I mentioned this in the PR description. The tests had compilation errors because they depend on some things that were added in #8183. Now that it's merged, I have rebased on the latest and uncommented it.
I mostly skimmed the changes - @easwars may also want to take a quick pass. The commits here aren't quite as easy to review as the last change, since they go file-by-file. It would have been easier if one commit copied all the files, so that we could just skip that one commit when reviewing.
@@ -150,9 +155,24 @@ func (lrs *streamImpl) sendLoads(ctx context.Context, stream clients.Stream, clu
 		case <-tick.C:
 		case <-ctx.Done():
 			return
+		case <-lrs.finalSendRequest:
What if we don't use these two new fields, `finalSendRequest` and `finalSendDone`, and instead attempt to send the last load report anyway when `ctx` is done? If we do that, we won't even have to accept a context from the user in `Stop()`.
Discussed offline. We can go ahead with the current approach involving `finalSendRequest` and `finalSendDone`, because sending a load report after the top-level context `ctx` is done is not reliable: the stream context is a child of the top-level context, so the cancellation of `ctx` propagates to the stream context as well.
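The reasoning above can be sketched roughly as follows. This is not the PR's implementation: `reporter`, `run`, `stop`, `finalSendDemo`, and `childCanceledWithParent` are hypothetical names, and "sending" is simulated by appending to a slice. The first helper shows that canceling a parent context also cancels a derived child, which is why a send attempted after `ctx.Done()` cannot rely on the stream still being usable. The rest shows the request/done channel handshake that performs the final send before cancellation.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// childCanceledWithParent shows that canceling a parent context cancels
// any context derived from it.
func childCanceledWithParent() bool {
	parent, cancel := context.WithCancel(context.Background())
	child, childCancel := context.WithCancel(parent)
	defer childCancel()
	cancel()
	<-child.Done()
	return child.Err() == context.Canceled
}

// reporter is a hypothetical stand-in for the stream implementation.
type reporter struct {
	finalSendRequest chan struct{}
	finalSendDone    chan struct{}
	sent             []string // written only by run
}

// run simulates the periodic send loop with a final-send branch.
func (r *reporter) run(ctx context.Context, interval time.Duration) {
	tick := time.NewTicker(interval)
	defer tick.Stop()
	for {
		select {
		case <-tick.C:
			r.sent = append(r.sent, "periodic")
		case <-ctx.Done():
			return
		case <-r.finalSendRequest:
			r.sent = append(r.sent, "final")
			close(r.finalSendDone)
			return
		}
	}
}

// stop asks for a final send, waits for it or for stopCtx to be done,
// and only then cancels the top-level context.
func (r *reporter) stop(stopCtx context.Context, cancel context.CancelFunc) {
	select {
	case r.finalSendRequest <- struct{}{}:
		select {
		case <-r.finalSendDone:
		case <-stopCtx.Done():
		}
	case <-stopCtx.Done():
	}
	cancel()
}

// finalSendDemo runs the handshake once and returns what was "sent".
func finalSendDemo() []string {
	ctx, cancel := context.WithCancel(context.Background())
	r := &reporter{
		finalSendRequest: make(chan struct{}),
		finalSendDone:    make(chan struct{}),
	}
	exited := make(chan struct{})
	go func() {
		defer close(exited)
		r.run(ctx, time.Hour) // long interval: no periodic sends in the demo
	}()
	stopCtx, stopCancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer stopCancel()
	r.stop(stopCtx, cancel)
	<-exited // run has returned, so reading r.sent is safe
	return r.sent
}

func main() {
	fmt.Println("child canceled with parent:", childCanceledWithParent())
	fmt.Println("sent:", finalSendDemo())
}
```

The design choice is that the final report is sent while the top-level context is still live, and the caller's `Stop` context only bounds how long to wait for that attempt.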
@@ -43,38 +68,372 @@ type LoadStore struct {
 // attempt to flush any unreported load data to the LRS server. It will either
 // wait for this attempt to complete, or for the provided context to be done
 // before canceling the LRS stream.
-func (ls *LoadStore) Stop(ctx context.Context) error {
-	panic("unimplemented")
+func (ls *LoadStore) Stop(ctx context.Context) {
I made a comment which could possibly simplify the implementation.
Here though, I still think we should remove the last two paragraphs in this doc string. There is no mention anywhere else about reference counting and sharing of LRS streams etc, and that is completely implementation detail that a user of this API does not need to know about.
@@ -43,38 +68,372 @@ type LoadStore struct {
 // attempt to flush any unreported load data to the LRS server. It will either
 // wait for this attempt to complete, or for the provided context to be done
 // before canceling the LRS stream.
-func (ls *LoadStore) Stop(ctx context.Context) error {
-	panic("unimplemented")
+func (ls *LoadStore) Stop(ctx context.Context) {
Nit: remove the "`ctx`" from the docstring. Just say "provided context".
This change adds a generic LRS client for load reporting to an LRS server.
The PR copies the existing `xds/internal/xdsclient/load/store.go`, `xds/internal/xdsclient/transport/lrs/lrs_stream.go`, `xds/internal/xdsclient/load/store_test.go`, and `xds/internal/xdsclient/tests/loadreport_test.go` from the internal xdsclient code and then modifies them to use the generic client types and interfaces. Each "copy" commit is followed by the "modify" commit for that file. Reviewers can start by reviewing the "modify" commits.
PS: Currently `loadreport_test.go` is commented out because it has compilation errors: it depends on some of the functions added in #8183.
RELEASE NOTES: None