-
Notifications
You must be signed in to change notification settings - Fork 158
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add AC hit rate metrics with prometheus labels (alternative implementation with decorator) #358
base: master
Are you sure you want to change the base?
Conversation
This is a draft, not ready to be merged. I wait with test cases until getting feedback about the main functionality. Same functionality as in buchgr#350 - Prometheus counter for cache hit ratio of only AC requests. - Support for prometheus labels based on custom HTTP and gRPC headers. but implemented in an alternative way: - disk.io implements two interfaces: cache.CasAcCache, cache.Stats - cache.metricsdecorator is decorator for cache.CasAcCache and provides prometheus metrics. Pros with this alternative implementation: - Should allow non-prometheus metrics as requested in buchgr#355 - Avoid the question about if the counters should be placed in disk.go or http.go/grpc*.go. If placing in disk.go, there are issues about how to avoid incrementing counter twice for the same request (both in Get and in GetValidatedActionResult) and at the same time count found AC but missing CAS, as cache miss. - Makes headers available also for logic in disk.go, which could be useful for other functionality in the future. - Metrics can be separated from logging, and still not require injecting counter increment invocations in tons of places. Incrementing from a few well defined places minimize the risk for bugs in the metrics.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't had time to fully digest this or the original version yet, but this feels a bit cleaner so far. I'll get back to you late this week with some more feedback.
@@ -649,6 +649,9 @@ func cacheFilePath(kind cache.EntryKind, cacheDir string, hash string) string { | |||
// value from the CAS if it and all its dependencies are also available. If | |||
// not, nil values are returned. If something unexpected went wrong, return | |||
// an error. | |||
// TODO Consider separating implementation of cache.AcStore interface, to open up | |||
// possibilities combining that functionality with proxies, and also allow | |||
// bazel-remote configurations with proxy but no local disk storage? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A proxy-only mode has been on my todo-list for a while. I have considered doing this as a short-circuit inside disk.Cache (it might be time to rename or split the package at that point).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mostynb Does it make sense to bring back the cache.Cache
interface that the proxies and disk cache used to implement for something like that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mostynb Is it something more than implementation of GetValidatedActionResult, that you expect will be needed from disk.io, in a proxy-only-mode?
@@ -344,8 +348,7 @@ func main() { | |||
} | |||
|
|||
validateAC := !c.DisableHTTPACValidation | |||
h := server.NewHTTPCache(diskCache, accessLogger, errorLogger, validateAC, c.EnableACKeyInstanceMangling, gitCommit) | |||
|
|||
h := server.NewHTTPCache(casAcCache, diskCache, accessLogger, errorLogger, validateAC, c.EnableACKeyInstanceMangling, gitCommit) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems a bit strange to pass both casAcCache and diskCache here. Doesn't the former wrap the latter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NewHTTPCache takes two interfaces: cache.BlobAcStore and cache.Stats.
The file disk.go implements both those interfaces.
But metricsdecorator is a decorator only for cache.BlobAcStore. The main reason is that I imagine using other metricsdecorator instances wrapping also the proxies, replacing the current counters in the proxies. But the proxies does not implement cache.Stats and therfore metricsdecorator also does not. (Using metricsdecorator instances also for proxies would involve setting counter name in call to NewMetricsDecorator, optional non-AC metrics support , etc)
The GRPC server is not using the cache.Stats interface.
// TODO Document interface | ||
type CasAcCache interface { | ||
// TODO change to io.ReadCloser? | ||
Put(kind EntryKind, hash string, size int64, rdr io.Reader, reqCtx RequestContext) error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we could use a map[string]string instead of a RequestContext interface in these functions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried that first, but ended up with cache.RequestContext because:
-
gRPC and HTTP/2 is using canonical lowercase header names, but the golang HTTP/1.1 library normalize header names to start with capital letter. I want to hide that difference, but avoid translating headers that are never accessed, and instead translate on demand in:
Lines 395 to 398 in 4b2b8b2
func (h *reqCtxHttp) GetHeader(headerName string) (headerValues []string) { headerName = strings.Title(headerName) if headerValues, ok := h.request.Header[headerName]; ok { return headerValues -
Minimize overhead for creating cache.RequestContext instance in general (only allocate struct and store a pointer)
-
Allows extending cache.RequestContext with additional methods that returns non-header meta information. Such as Host, RemoteAddr or Cookie from HTTP requests. Or any other information that could be added by bazel-remote’s HTTP/GRPC servers, like enum indicating if original request were HTTP or GRPC.
-
Allows extending cache.RequestContext to allow proxies to cancel outgoing requests, if associated incomming request is canceled. I have not looked into details, but seems to be something related to that in underlying GRPC and HTTP contexts.
-
Allows extending cache.RequestContext in general, instead of having to update all Get/Put/Contains invocations with additional parameters, all over bazel-remote codebase.
-
Allows propagating custom information in general between a chain of different cache.BlobStore implementations (such as proxies, decorators, …)
|
||
// TODO Document interface | ||
type AcStore interface { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this interface used anywhere outside BlobAcStore?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. I’m only thinking that having a separate AcStore interface might be more generic and flexible. Especially if GetValidatedActionResult method would be extracted from disk.go in the future. Or if there would be decorators wrapping only AcStore. But I don’t have much experience of embedding interfaces and structs in golang. It might work with only BlobStore and BlobAcStore interface, if you think it is preferable to avoid AcStore?
Hi @mostynb. I would be happy to continue working on this patch, what do you think? |
@ulrfa Is it possible to break this into two PRs? Basically thinking:
I think it will be easier to digest and review. If not I understand and will carve out some time later to look it over. |
Hi @bdittmer,
That is all changes except the three files listed below.
That is the changes in the three source files:
Perhaps the list of source files above is enough for decision about if you would want me to go forward with this. If you decide yes, then I will add test cases and can also split into several PRs if you want. Is that OK? I will probably not have time to work on this until after next week. |
This is a draft, not ready to be merged. I wait with test cases until
getting feedback about the main functionality.
Same functionality as in #350
Prometheus counter for cache hit ratio of only AC requests.
Support for prometheus labels based on custom HTTP and gRPC headers.
but implemented in an alternative way:
disk.io implements two interfaces: cache.CasAcCache, cache.Stats
cache.metricsdecorator is decorator for cache.CasAcCache and
provides prometheus metrics.
Pros with this alternative implementation:
Should allow non-prometheus metrics as requested in
Metrics interface(s) #355
Avoid the question about if the counters should be placed in
disk.go or http.go/grpc*.go. If placing in disk.go, there are
issues about how to avoid incrementing counter twice for the
same request (both in Get and in GetValidatedActionResult)
and at the same time count found AC but missing CAS, as cache miss.
Makes headers available also for logic in disk.go, which could be
useful for other functionality in the future.
Metrics can be separated from logging, and still not require
injecting counter increment invocations in tons of places.
Incrementing from a few well defined places minimize the risk
for bugs in the metrics.