[feature] Prometheus metrics implementation #1218
Comments
Another checkmark worth adding (so it's not forgotten): minimize metrics overhead, allow disabling
Added :)
One interesting thing in Prometheus that's currently under preview in 2.40.x is the new native/sparse histograms. The current histograms have a slight usability issue: the buckets need to be defined upfront and you have to hope you get them right. In the case of GTS that might also be hard, because instances of different sizes or running on smol devices might produce fairly different measurements for things like HTTP response latency. The native type doesn't suffer from this, making it much more useful in practice.
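To illustrate the bucket problem described above, here is a minimal Go sketch that uses only the standard library, not the Prometheus client. The bucket bounds and latencies are made-up values, and `fixedBucket` and `expBucket` are hypothetical helpers. The exponential scheme is only similar in spirit to native histograms, not Prometheus' actual bucket layout.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// fixedBucket returns the index of the first upper bound that is >= v,
// or len(bounds) if v lies past the last bound (the implicit +Inf bucket).
func fixedBucket(bounds []float64, v float64) int {
	return sort.SearchFloat64s(bounds, v)
}

// expBucket returns the index i of the exponential bucket
// (factor^(i-1), factor^i] that contains v. v must be > 0.
func expBucket(factor, v float64) int {
	return int(math.Ceil(math.Log(v) / math.Log(factor)))
}

func main() {
	// Hypothetical upper bounds in seconds, tuned for a fast server.
	bounds := []float64{0.005, 0.01, 0.025, 0.05, 0.1}

	// Hypothetical latencies in seconds from a slow single-board computer.
	slow := []float64{0.4, 0.9, 2.5}

	for _, v := range slow {
		fmt.Printf("%.1fs -> fixed bucket %d, exponential bucket %d\n",
			v, fixedBucket(bounds, v), expBucket(2, v))
	}
}
```

With the fixed bounds, every slow latency lands in the same overflow bucket, so the histogram says nothing about their distribution. The exponential buckets need no upfront tuning and still separate the values.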
If we implement OpenTelemetry tracing (#1230), it seems we can also use a Prometheus exporter for that, to allow Prometheus to pull the OpenTelemetry metrics: https://github.com/open-telemetry/opentelemetry-go/tree/main/exporters/prometheus
Useful metrics I planned to add but just didn't get around to yet:
Re the requested "Minimize metrics overhead, allow disabling" @NyaaaWhatsUpDoc: I think it's better to filter that at the Prometheus level? Most applications I come across don't have something like that, and for them you'd have to filter in Prometheus anyway, making that a better source of truth. Does this make sense, what do you think?
My point is to not have Prometheus metrics gathered in the binary at all, and ideally not even compiled into the binary, if not necessary. So no overhead if not enabled.
Ah it's about not even gathering them, reducing load in GtS itself - got it, thanks!
I don't know if Mastodon exports OpenMetrics, but if it does, it would be wise to follow their metric naming.
#1623 opens up the following method:
"tracing" and "metrics" should probably be unified under "observability" regarding build flags etc.?
Real proper tracing is something I'd not use; metrics, on the other hand, I'd use heavily - so turning them on and off separately makes sense to me.
I could work on this, getting the runtime metrics and Gin HTTP metrics was rather straightforward.
Closed by #2334. We might want to add additional metrics later, but we can make smaller issues for that. For now, we record Go, Gin, and Bundb metrics with this thing, which is plenty to get started with!
We should look into instrumenting GoToSocial with prometheus metrics gathering, so that admins can glean more insight into the performance and load of their instance.
We should start with just one or two useful metrics and play around with it, rather than worrying about implementing everything at once.
We should be wary of performance impact, and make this toggleable so that if it's off it doesn't cause any memory/cpu overhead.
Likely we should have this 'off' by default, since most people likely won't need it or want it.
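One way the "toggleable, off by default" requirement could look in Go is sketched below. It is a standard-library-only illustration, not the project's actual design. The `Recorder` interface, its two implementations, and `NewRecorder` are hypothetical names.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Recorder is a hypothetical interface that instrumented code would call.
type Recorder interface {
	IncRequests()
	Requests() int64
}

// noopRecorder does nothing, so disabled metrics cost only an interface call.
type noopRecorder struct{}

func (noopRecorder) IncRequests() {}

func (noopRecorder) Requests() int64 { return 0 }

// countingRecorder keeps a real counter and is used only when metrics are enabled.
type countingRecorder struct{ n atomic.Int64 }

func (c *countingRecorder) IncRequests() { c.n.Add(1) }

func (c *countingRecorder) Requests() int64 { return c.n.Load() }

// NewRecorder picks an implementation from a (hypothetical) config flag.
// Passing false, the assumed default, returns the no-op implementation.
func NewRecorder(enabled bool) Recorder {
	if enabled {
		return &countingRecorder{}
	}
	return noopRecorder{}
}

func main() {
	for _, enabled := range []bool{false, true} {
		r := NewRecorder(enabled)
		r.IncRequests()
		r.IncRequests()
		fmt.Printf("enabled=%v requests=%d\n", enabled, r.Requests())
	}
}
```

This only avoids gathering at runtime. Keeping the metrics code out of the binary entirely would need something like Go build constraints, which this sketch does not show.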
Here's a little checklist to get us going: