-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add loadSuccessTime() and loadFailureTime() to CacheStats #409
Comments
Can you explain the benefits from the perspective of the metrics themselves? I understand it is nice consistency for metric reporting, but I do not know how you would operationally react to this data. I suspect that is why they were combined in Guava, where this was never asked for. More often you would be trying to understand why failures occurred at all, rather than the time spent on each. Note that you can supply your own |
@ben-manes One of the big benefits would be in unifying both success and fail outcomes under one metric name, something like Many monitoring systems don't support shipping a timer Having unified the metric names, we can plot total attempted loads by simply plotting the count statistic shipped by Also, plotting error ratio as a function of throughput (rather than error rate which isn't as useful) is more intuitive. It becomes More...To really go to the next level, if there are different failure modes under which a load operation can fail, tag with that detail and a coarse tag of More more...The really ideal case would be if Caffeine could allow for Micrometer to record with a regular timer. The most useful latency metrics are max and high percentiles, but that detail is lost in the abstraction right now. The best we can do with a |
For more advanced cases, you could implement your own The cache only knows of a failure when either an exception is thrown or the expected entry is absent. This is reported identically and, most often, is unexpected failures that cause application errors. The fine grained causes would have to be instrumented in application code. Given that failures that we can report are typically an application bug to be fixed, I'm still confused as to the operational value. Yes, for symmetry it is a reasonable ask. But I am confused as to how this metric would help developers? It would seem that this particular ask is not very valuable but that supplying a custom |
I am working on v3 where we could make this change. Unfortunately my concerns were not addressed in your reply so I am still inclined to not provide this. I do not see how it would assist operational teams (performance, SREs) and therefore lean against tracking metrics that don’t carry their weight. |
FYI, I implemented StatsCounter for micrometer: micrometer-metrics/micrometer#2163 So micrometer users who want the full breakdown will be able to get what they want that way. |
Thanks, I think that's a nice approach. Closing. |
@john-karp The |
CacheStats tracks loadSuccessCount and loadFailureCount independently, but doesn't do a corresponding breakdown of load time.
I think it would be helpful to know if time is being spent in success vs failure.
Came up in micrometer-metrics/micrometer#2020
The text was updated successfully, but these errors were encountered: