Expose database query / application metrics on internal `/metrics` endpoint #620

brahman81 · 2018-08-28T17:03:47Z

To assist during debugging and capacity planning, it would prove useful to expose database metrics on the /metrics Horizon endpoint.

- Time spent performing the various database calls (offers, transactions, assets, accounts, etc.)
- Total requests per second to the core database
- Total requests per second to the horizon database

Ideally we would namespace the metrics to distinguish Horizon vs Core database queries.

MonsieurNicolas · 2018-08-31T20:04:07Z

This should probably not be internet-facing.

brahman81 · 2018-09-05T11:03:56Z

Thanks @MonsieurNicolas, it makes sense to somehow restrict access to these extra database metrics.

Perhaps a user could restrict access to the /metrics endpoint before enabling these new DB stats via a config option? These are nice metrics to graph, especially when debugging or doing capacity planning.

brahman81 · 2019-03-22T11:45:03Z

Having a second http listener started on an alternate port 8001 would be ideal imo, access can easily be restricted by most users and it would be a big ops win to be able to extract these types of metrics from Horizon.

I have dreams of Horizon metrics being in Prometheus, Grafana, etc.

brahman81 · 2019-10-15T19:12:14Z

Is instrumenting the application with https://github.com/prometheus/client_golang an option? It would avoid the need for an external exporter and allow Prometheus to scrape Horizon directly.

bartekn · 2020-02-11T21:19:33Z

Just added a couple PRs connected to this:

- services/horizon: Move /metrics to internal server #2261
- services/horizon/actions: Add Prometheus text exposition format in /metrics #2265
- services/horizon: Add new ingestion system metrics to /metrics #2260

When all are merged I'll deploy it to the staging server and we can try to integrate it with our Prometheus server.

bartekn · 2020-02-12T18:10:03Z

All PRs above are merged. The DB metrics require a small refactor of the support/db package, so I'm moving this to 1.1.0. cc @ire-and-curses.

bartekn · 2020-07-24T14:05:10Z

> Is instrumenting the application with https://github.com/prometheus/client_golang an option?

It's done in #2846, which should make it easier to add more metrics soon.

@stellar/horizon-committers if you have ideas regarding new metrics please add them as a comment here. Here's my list:

- Duration of the processing time for each ingestion processor. Per change/transaction breakdown.
- Counter for each tx/op error type returned by txsub.
- Duration of the order book graph state update per ledger.
- LedgerEntryChangeCache compression ratio stats.

When it comes to SQL query stats, I'm wondering if we should do it at all. First, the majority of endpoints send a single SQL query to get their results, so we can already track this using HTTP stats. Second, we often modify the SQL string for the same query type. The obvious example is the batch insert builders, where the query text changes with the number of rows. We'd need to name each query and probably add a second parameter recording the number of rows being inserted.
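
To illustrate the second point, a small sketch: the SQL text of a batch insert differs with the row count, so the stat is keyed by a hand-assigned name and the row count is recorded alongside it. `buildInsert`, `Stats`, and the `insert_offers` name are hypothetical and not taken from support/db; the map is not safe for concurrent use.

```go
package main

import (
	"fmt"
	"strings"
)

// buildInsert returns a multi-row INSERT whose text depends on the row count,
// which is why the raw SQL string makes a poor metric key.
func buildInsert(table string, cols []string, rows int) string {
	row := "(" + strings.TrimSuffix(strings.Repeat("?, ", len(cols)), ", ") + ")"
	values := strings.TrimSuffix(strings.Repeat(row+", ", rows), ", ")
	return fmt.Sprintf("INSERT INTO %s (%s) VALUES %s", table, strings.Join(cols, ", "), values)
}

// series holds the two numbers tracked per named query.
type series struct {
	Calls int // how many times the query ran
	Rows  int // total rows across all calls
}

// Stats is keyed by a stable, hand-assigned query name.
type Stats map[string]series

// Record adds one execution of the named query affecting the given rows.
func (s Stats) Record(name string, rows int) {
	cur := s[name]
	cur.Calls++
	cur.Rows += rows
	s[name] = cur
}

func main() {
	cols := []string{"id", "amount"}
	small := buildInsert("offers", cols, 1)
	large := buildInsert("offers", cols, 3)
	fmt.Println(small)
	fmt.Println(large)

	// Both executions land in the same series despite different SQL text.
	stats := Stats{}
	stats.Record("insert_offers", 1)
	stats.Record("insert_offers", 3)
	fmt.Println(stats["insert_offers"].Calls, stats["insert_offers"].Rows)
}
```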

2opremio · 2020-07-24T14:29:37Z

If it's not already there, how about ingestion throughput (ledgers/time) and captive core stats (CPU and memory consumption of captive core)? Also, the reingestion status: how many workers there are, what ledger ranges are being reingested, and what the progress is in each of them.

bartekn · 2020-07-24T14:53:55Z

> If it's not already there. How about ingestion throughput (ledgers/time)
> and captive core stats (CPU and memory consumption of captive core). Also,
> the reingestion status (how many workers, what ledger ranges are being
> reingested, what's the progress in each of them).

I think you're talking about reingestion, right?

We already have a summary for processed ledgers (that includes a counter) but throughput in the online mode will be stable at 1 ledger per 5 seconds on average. I don't think we have Captive Core CPU and memory stats available via Go so it should be done at OS level. For reingestion stats (# of workers, throughput - makes sense here, progress per worker, etc.) 👍.

bartekn · 2020-08-17T15:09:46Z

Added one more metric here: #2921. Closing this, let's open a separate issue for each metric when it's really needed.

bartekn added the help wanted label Aug 28, 2018

bartekn added this to the Horizon v0.15.0 milestone Sep 5, 2018

bartekn added the horizon label Sep 5, 2018

bartekn modified the milestones: Horizon v0.15.0, Horizon v0.16.0 Oct 30, 2018

bartekn modified the milestones: Horizon v0.16.0, Horizon next minor release Jan 22, 2019

bartekn removed this from the Horizon next minor release milestone Jun 13, 2019

ire-and-curses added the Hacktoberfest https://hacktoberfest.digitalocean.com/details label Sep 30, 2019

bartekn changed the title ~~expose database query metrics on /metrics endpoint~~ Expose database query / application metrics on internal /metrics endpoint Nov 12, 2019

bartekn added this to the Horizon 0.24.0 milestone Nov 12, 2019

bartekn modified the milestones: Horizon 0.24.0, Horizon 0.25.0 Dec 3, 2019

ire-and-curses modified the milestones: Horizon 0.25.0, Horizon 0.26.0 Jan 7, 2020

abuiles removed the Hacktoberfest https://hacktoberfest.digitalocean.com/details label Jan 7, 2020

ire-and-curses mentioned this issue Feb 12, 2020

Remove /metrics from root URL #2268

Closed

bartekn modified the milestones: Horizon 1.0.0-stable, Horizon 1.1.0 Feb 12, 2020

ire-and-curses removed this from the Horizon 1.0.1 milestone Mar 17, 2020

ire-and-curses removed the help wanted label Mar 17, 2020

bartekn self-assigned this May 13, 2020

bartekn added this to the Horizon 1.5.0 milestone Jun 4, 2020

bartekn removed this from the Horizon 1.5.0 milestone Jul 1, 2020

bartekn added this to the Horizon 1.7.0 milestone Jul 14, 2020

This was referenced Jul 22, 2020

services/horizon: Add additional DB metrics #2844

Merged

services/horizon: Refactor metrics, use Prometheus package #2846

Merged

bartekn modified the milestones: Horizon 1.7.0, Horizon 1.8.0 Aug 11, 2020

bartekn closed this as completed Aug 17, 2020
