feat: Track metrics around usage of the GitHub API #3273
Thanks for looking into this.
I do think it might be easier to instrument at the *http.Client level with a round tripper, as this would be a lot less subject to refactoring and breakage and would have access to a lot more data.
pkg/querier/vcs/client/metrics.go
Outdated
    },
    []string{"path", "status"},
),
APIRateLimit: promauto.With(reg).NewGauge(prometheus.GaugeOpts{
I worry that this metric might not be very helpful or accurate:
My understanding is that we are subject to the rate limits of the GitHub user who authenticated via OAuth2. So if two GitHub users are using the service at the same time, the value would constantly jump between their two remaining-request counts.
I do think this is something important to watch. Maybe this could be a histogram with fitting buckets, which would show us how close we got to being rate limited.
You're absolutely right here. The rate limit remaining here is per-user (doc) since we're using a user token. If we had an installation token, the rate limit would be per app.
How I'm using the metric here is not helpful at all. Like you said, it's going to jump to whatever remaining rate limit the last token had. I'm going to rethink how to track this, as a gauge is clearly not the right approach.
pkg/querier/vcs/client/metrics.go
Outdated
),
APIRateLimit: promauto.With(reg).NewGauge(prometheus.GaugeOpts{
    Namespace: "pyroscope",
    Name:      "vcs_github_rate_limit",
- Name: "vcs_github_rate_limit",
+ Name: "vcs_github_remaining_request_quota",
LGTM
Thanks for this new revision. I have not fully tested it myself, but it all looks correct code-wise.
Related: https://github.com/grafana/pyroscope-squad/issues/138
We have metrics surrounding how the VCS service behaves (latency, errors, etc) but we have limited insight into our usage of the GitHub API. This PR tracks our GitHub API usage in two ways: