[feature] Prometheus metrics implementation #1218

Closed
3 of 5 tasks
tsmethurst opened this issue Dec 6, 2022 · 13 comments
Labels: future tech, performance

Comments

@tsmethurst (Contributor) commented Dec 6, 2022:

We should look into instrumenting GoToSocial with prometheus metrics gathering, so that admins can glean more insight into the performance and load of their instance.

We should start with just one or two useful metrics and play around with it, rather than worrying about implementing everything at once.

We should be wary of performance impact, and make this toggleable so that if it's off it doesn't cause any memory/cpu overhead.

This should likely be 'off' by default, since most people won't need or want it.
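
A minimal sketch of what such a toggle might look like with plain client_golang, assuming a hypothetical `metrics-enabled` config value (the package, function name, and collectors here are illustrative, not GoToSocial's actual code):

```go
package metrics

import (
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/collectors"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// Handler returns a /metrics handler when metrics are enabled, or nil when
// they are disabled, in which case no collectors are registered and nothing
// is gathered at all. "enabled" stands in for a hypothetical
// `metrics-enabled` config value.
func Handler(enabled bool) http.Handler {
	if !enabled {
		return nil // off by default: zero gathering overhead
	}

	// A private registry (instead of the global default) keeps the toggle
	// in full control of what actually gets collected.
	reg := prometheus.NewRegistry()
	reg.MustRegister(
		collectors.NewGoCollector(),
		collectors.NewProcessCollector(prometheus.ProcessCollectorOpts{}),
	)
	return promhttp.HandlerFor(reg, promhttp.HandlerOpts{})
}
```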

Here's a little checklist to get us going:

@tsmethurst added the future tech and performance labels on Dec 6, 2022
@NyaaaWhatsUpDoc (Member) commented:

Another checkmark worth adding (so it's not forgotten): minimize metrics overhead, allow disabling

@tsmethurst (Contributor, Author) commented:

> Another checkmark worth adding (so it's not forgotten): minimize metrics overhead, allow disabling

Added :)

@daenney (Member) commented Dec 9, 2022:

One interesting thing in Prometheus that's currently in preview in 2.40.x is the new native/sparse histograms. The current histograms have a slight usability issue in that the buckets need to be defined upfront and you have to hope you got them right. In the case of GTS that might also be hard, because instances of different sizes, or running on smol devices, might produce fairly different measurements for things like HTTP response latency. The native type doesn't suffer from this, making it much more useful in practice.
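
For comparison, a sketch of the two definitions in client_golang (native histogram support landed around v1.14 behind the `NativeHistogramBucketFactor` option; metric names here are made up, and the Prometheus server would need `--enable-feature=native-histograms`):

```go
package main

import "github.com/prometheus/client_golang/prometheus"

func main() {
	// Classic histogram: buckets have to be guessed upfront, and values that
	// fit a big instance may fit a small one badly (or vice versa).
	classic := prometheus.NewHistogram(prometheus.HistogramOpts{
		Name:    "http_request_duration_seconds_classic", // illustrative name
		Help:    "Request latency with fixed buckets.",
		Buckets: []float64{0.005, 0.01, 0.05, 0.1, 0.5, 1, 5},
	})

	// Native (sparse) histogram: buckets are derived automatically from a
	// growth factor, so the same definition adapts to very different latency
	// profiles on big and small instances alike.
	native := prometheus.NewHistogram(prometheus.HistogramOpts{
		Name:                        "http_request_duration_seconds_native",
		Help:                        "Request latency as a native histogram.",
		NativeHistogramBucketFactor: 1.1, // ~10% relative bucket width
	})

	classic.Observe(0.042)
	native.Observe(0.042)
}
```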

@tsmethurst (Contributor, Author) commented:

If we implement opentelemetry tracing (#1230), it seems we can also use a prometheus exporter for that, to allow prometheus to pull the opentelemetry metrics: https://github.com/open-telemetry/opentelemetry-go/tree/main/exporters/prometheus
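
A rough sketch of that wiring, assuming the `go.opentelemetry.io/otel/exporters/prometheus` exporter and the OTel metric SDK (the exact setup would depend on whatever the #1230 tracing work ends up using):

```go
package main

import (
	"log"
	"net/http"

	"github.com/prometheus/client_golang/prometheus/promhttp"
	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/exporters/prometheus"
	sdkmetric "go.opentelemetry.io/otel/sdk/metric"
)

func main() {
	// The exporter acts as the OTel SDK's metric reader and registers with
	// the default Prometheus registerer unless configured otherwise.
	exporter, err := prometheus.New()
	if err != nil {
		log.Fatal(err)
	}
	otel.SetMeterProvider(sdkmetric.NewMeterProvider(sdkmetric.WithReader(exporter)))

	// Anything recorded through OTel instruments now shows up here for
	// Prometheus to scrape.
	http.Handle("/metrics", promhttp.Handler())
	log.Fatal(http.ListenAndServe(":9091", nil)) // port is illustrative
}
```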

@LittleFox94 (Contributor) commented:

Useful metrics I planned to add but just didn't get around to yet (a rough sketch of a few of these follows after the list):

  • the usual runtime metrics (goroutines, memory, ... standard Go collectors)
  • number of users
  • number of posts
  • number of instances federated with
  • the usual HTTP request metrics (count by status code, latency, request/response body size, ...)
    • split between server-to-server / server-to-client
    • s2s split between if requesting or replying
  • database metrics, especially latency (as bun already warns about that)
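
A sketch of a couple of these with plain client_golang; the count callbacks and metric names below are hypothetical, not real GtS functions or finalized names:

```go
package metrics

import (
	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/collectors"
)

// RegisterInstanceMetrics wires up the standard Go collectors plus a few
// instance-level gauges. countUsers, countPosts and countFederatedInstances
// are hypothetical callbacks into storage.
func RegisterInstanceMetrics(reg *prometheus.Registry, countUsers, countPosts, countFederatedInstances func() float64) {
	// "the usual runtime metrics (goroutines, memory, ...)"
	reg.MustRegister(collectors.NewGoCollector())

	// Gauges evaluated lazily at scrape time, so they cost nothing between scrapes.
	reg.MustRegister(prometheus.NewGaugeFunc(prometheus.GaugeOpts{
		Name: "gotosocial_users_total", // illustrative name
		Help: "Number of users on this instance.",
	}, countUsers))
	reg.MustRegister(prometheus.NewGaugeFunc(prometheus.GaugeOpts{
		Name: "gotosocial_statuses_total",
		Help: "Number of posts on this instance.",
	}, countPosts))
	reg.MustRegister(prometheus.NewGaugeFunc(prometheus.GaugeOpts{
		Name: "gotosocial_federated_instances_total",
		Help: "Number of instances federated with.",
	}, countFederatedInstances))
}
```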

@LittleFox94 (Contributor) commented:

Re the requested "Minimize metrics overhead, allow disabling" @NyaaaWhatsUpDoc: I think it's better to filter that at the Prometheus level? Most applications I come across don't have something like that, and for them you'd have to filter in Prometheus anyway, making that the better source of truth. Does this make sense, what do you think?

@NyaaaWhatsUpDoc (Member) commented:

My point is to not have Prometheus metrics gathered in the binary, and ideally not even compiled into the binary, if they're not needed. So: no overhead if not enabled.

@LittleFox94 (Contributor) commented:

Ah it's about not even gathering them, reducing load in GtS itself - got it, thanks!

@raspbeguy commented:

I don't know if Mastodon exports OpenMetrics, but if it does it would be wise to follow their metric naming.

@Tsuribori (Contributor) commented:

#1623 opens up the following method:

> If we implement opentelemetry tracing (#1230), it seems we can also use a prometheus exporter for that, to allow prometheus to pull the opentelemetry metrics: https://github.com/open-telemetry/opentelemetry-go/tree/main/exporters/prometheus

"tracing" and "metrics" should probably be unified under "observability" regarding build flags etc.?

@LittleFox94 (Contributor) commented:

> #1623 opens up the following method:
>
> > If we implement opentelemetry tracing (#1230), it seems we can also use a prometheus exporter for that, to allow prometheus to pull the opentelemetry metrics: https://github.com/open-telemetry/opentelemetry-go/tree/main/exporters/prometheus
>
> "tracing" and "metrics" should probably be unified under "observability" regarding build flags etc.?

Real proper tracing is something I'd not use; metrics, on the other hand, I'd use heavily - so turning them on and off separately makes sense to me.
But I'd be fine with all of that being in the binary all the time and it being a runtime config (config file/CLI flags) instead of build flags.
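
A sketch of what separate runtime toggles might look like, with made-up config field names standing in for whatever the real config file keys / CLI flags end up being (the tracing init is left as a stub, since that belongs to #1230):

```go
package observability

import (
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/collectors"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// Config stands in for GtS's real configuration; both field names are made
// up for illustration.
type Config struct {
	MetricsEnabled bool
	TracingEnabled bool
}

// Setup keeps both features compiled into the binary but activates them
// independently at runtime, so metrics can be on while tracing stays off.
func Setup(cfg Config, mux *http.ServeMux) {
	if cfg.MetricsEnabled {
		reg := prometheus.NewRegistry()
		reg.MustRegister(collectors.NewGoCollector())
		mux.Handle("/metrics", promhttp.HandlerFor(reg, promhttp.HandlerOpts{}))
	}
	if cfg.TracingEnabled {
		initTracing()
	}
}

// initTracing is intentionally a stub here; the actual tracing setup is the
// subject of #1230.
func initTracing() {}
```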

@Tsuribori (Contributor) commented:

I could work on this; getting the runtime metrics and Gin HTTP metrics was rather straightforward.

@tsmethurst (Contributor, Author) commented:

Closed by #2334

We might want to add additional metrics later, but we can make smaller issues for that. For now, we record Go, Gin, and Bundb metrics with this thing, which is plenty to get started with!
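
For reference, a rough sketch of how those three sources might hang together with the OTel metric SDK and the Prometheus exporter. This is an assumption-heavy outline, not the code from #2334: the Gin middleware is hand-rolled, and the `bunotel` option, metric name, and scope name are illustrative.

```go
package metrics

import (
	"time"

	"github.com/gin-gonic/gin"
	"github.com/prometheus/client_golang/prometheus/promhttp"
	"github.com/uptrace/bun"
	"github.com/uptrace/bun/extra/bunotel"
	"go.opentelemetry.io/contrib/instrumentation/runtime"
	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/exporters/prometheus"
	sdkmetric "go.opentelemetry.io/otel/sdk/metric"
)

// Setup wires Go runtime, Gin HTTP and bun database metrics into an OTel
// meter provider backed by the Prometheus exporter, then exposes /metrics.
func Setup(engine *gin.Engine, db *bun.DB) error {
	exporter, err := prometheus.New()
	if err != nil {
		return err
	}
	otel.SetMeterProvider(sdkmetric.NewMeterProvider(sdkmetric.WithReader(exporter)))

	// Go runtime metrics (goroutines, GC, memory, ...).
	if err := runtime.Start(); err != nil {
		return err
	}

	// Gin HTTP metrics: a hand-rolled middleware recording request latency.
	meter := otel.Meter("gotosocial") // instrumentation scope name is illustrative
	latency, err := meter.Float64Histogram("http.server.request.duration")
	if err != nil {
		return err
	}
	engine.Use(func(c *gin.Context) {
		start := time.Now()
		c.Next()
		latency.Record(c.Request.Context(), time.Since(start).Seconds())
	})

	// Bun database metrics via the bunotel query hook.
	db.AddQueryHook(bunotel.NewQueryHook(bunotel.WithDBName("gotosocial")))

	// Expose everything for Prometheus to scrape.
	engine.GET("/metrics", gin.WrapH(promhttp.Handler()))
	return nil
}
```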
