Return measurements to users in HTTP response #1130

shanson7 · 2018-11-07T16:09:12Z

Metrictank publishes a lot of metrics around cache hits, timings, etc., but when optimizing or analyzing individual queries, much of this detail gets lost. Jaeger is somewhat helpful but not sufficient for a variety of reasons (request volume, sampling, etc.).

It would be nice to have a flag indicating that stats should be returned with the response, to help figure out what a given request triggered. Some useful stats might be:

- Time spent resolving the targets into a concrete list of series
- Time spent fetching the data
- Cache usage (tank, chunk cache, cassandra) plus timings
- Pre-run logic (mergeSeries, sorting, etc.)
- Plan run time
- The number of points pulled in, the number of points returned, the series pulled in.
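
As a rough illustration of the list above, the requested measurements could be gathered into one per-request value. This is only a sketch: `RenderStats`, its field names, the JSON keys, and the `timeStep` helper are all hypothetical and are not taken from Metrictank's code.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// RenderStats is a hypothetical container for the per-request measurements
// listed in the issue. Field names and JSON keys are illustrative only.
type RenderStats struct {
	ResolveSeriesMs     int64 `json:"resolveSeriesMs"`     // resolving targets into concrete series
	FetchMs             int64 `json:"fetchMs"`             // fetching the data
	ChunksFromTank      int64 `json:"chunksFromTank"`      // cache usage: in-memory tank
	ChunksFromCache     int64 `json:"chunksFromCache"`     // cache usage: chunk cache
	ChunksFromCassandra int64 `json:"chunksFromCassandra"` // reads that went to cassandra
	PreRunMs            int64 `json:"preRunMs"`            // mergeSeries, sorting, etc.
	PlanRunMs           int64 `json:"planRunMs"`           // plan run time
	PointsFetched       int64 `json:"pointsFetched"`
	PointsReturned      int64 `json:"pointsReturned"`
	SeriesFetched       int64 `json:"seriesFetched"`
}

// timeStep runs fn and stores its elapsed wall-clock time (in ms) in dst.
func timeStep(dst *int64, fn func()) {
	start := time.Now()
	fn()
	*dst = time.Since(start).Milliseconds()
}

// Encode serializes the stats as a JSON object.
func (s RenderStats) Encode() (string, error) {
	b, err := json.Marshal(s)
	return string(b), err
}

// Decode parses a JSON object produced by Encode.
func Decode(raw string) (RenderStats, error) {
	var s RenderStats
	err := json.Unmarshal([]byte(raw), &s)
	return s, err
}

func main() {
	var s RenderStats
	// Stand-in for the real fetch step; the numbers are made up.
	timeStep(&s.FetchMs, func() { s.PointsFetched, s.SeriesFetched = 100, 2 })
	s.PointsReturned = 10

	out, err := s.Encode()
	if err != nil {
		fmt.Println("encode error:", err)
		return
	}
	fmt.Println(out)
}
```

Keeping everything in one flat value makes it easy to serialize the same data into whichever transport ends up being chosen.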

Most of these stats are already collected for Jaeger and/or publishing as aggregate metrics.

Dieterbe · 2019-03-13T10:17:27Z

> Jaeger is somewhat helpful but not sufficient for a variety of reasons (request volume, sampling, etc).

Would it solve the problem if Jaeger had sufficiently low overhead (e.g. via sampling, or perhaps even being disabled by default) and you had a way to make sure a specific request is not sampled away? For example, some kind of request flag that forces a request to be instrumented via Jaeger?

shanson7 · 2019-03-13T10:30:11Z

Part of it is visibility to end users, where our current tracing in Jaeger is far too verbose and not exposed.

These stats are visible in the query inspector in Grafana, so it's very convenient for our users to see these stats.

Dieterbe · 2019-03-13T10:43:56Z

OK.
Let's see if the Grafana team has any recommendations on how to expose these stats. cc @daniellee @torkelo
My main concern, as mentioned in the PR, is that Graphite responses are an array of series structs/dictionaries, so we can't really add a global stats section to the response. Perhaps we should do it via HTTP headers.
Maybe @DanCech, as the Graphite maintainer, has a recommendation as well?
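
To make the header idea concrete, here is a minimal sketch of a handler that leaves the body as a plain array of series and attaches stats as a response header only when asked. The header name `X-Render-Stats`, the `stats=true` query parameter, the `name=duration;...` value format, and the hard-coded numbers are all assumptions for illustration, not an existing Metrictank or Graphite API.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// statsHeader is a hypothetical header name chosen for this sketch.
const statsHeader = "X-Render-Stats"

// Series approximates one element of a Graphite-style JSON response.
type Series struct {
	Target     string       `json:"target"`
	Datapoints [][2]float64 `json:"datapoints"` // [value, timestamp]
}

// renderHandler always returns the usual array of series. When the
// (hypothetical) stats=true parameter is set, it also sets the stats header,
// so the body shape stays unchanged for existing clients.
func renderHandler(w http.ResponseWriter, r *http.Request) {
	series := []Series{{Target: "some.metric", Datapoints: [][2]float64{{1.5, 1541606940}}}}
	body, err := json.Marshal(series)
	if err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return
	}
	if r.URL.Query().Get("stats") == "true" {
		// Made-up values; a real handler would fill these from measurements.
		w.Header().Set(statsHeader, "resolve=3ms;fetch=12ms;plan-run=1ms")
	}
	w.Header().Set("Content-Type", "application/json")
	w.Write(body)
}

// call runs the handler in-process and returns the stats header and the body.
func call(url string) (stats, body string) {
	rec := httptest.NewRecorder()
	renderHandler(rec, httptest.NewRequest(http.MethodGet, url, nil))
	return rec.Result().Header.Get(statsHeader), rec.Body.String()
}

func main() {
	stats, body := call("/render?target=some.metric&stats=true")
	fmt.Println("stats header:", stats)
	fmt.Println("body:", body)
}
```

A client that does not send the flag sees exactly the response it sees today, which is the main attraction of headers over changing the body into an object.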

davkal · 2019-03-18T08:47:13Z

Prometheus exposes similar stats when a parameter is present: prometheus/prometheus#2408
It would make sense to render those on demand similarly to the query inspector.
Elastic returns some stats as well. The challenge as pointed out above is to come up with a good model that accommodates most.

bergquist · 2019-03-19T06:29:09Z

I think at the common Grafana level we should stay simple and have a list of names with durations. Right now I don't think this would be used outside the query inspector.

As for the Graphite response, I would love to use an object instead of an array, but I'm guessing we would be the only users of such a feature, so I think I would prefer headers in this case. The Graphite datasource plugin could then translate those headers to the internal model.
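
The translation step could look roughly like the sketch below, which turns a header value into a simple list of names with durations. The `name=duration;...` format and the `Timing` and `parseTimings` names are assumptions made for this example; the actual plugin and header format are not specified in this thread.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// Timing is one named duration: the "list of names with durations" model.
type Timing struct {
	Name     string
	Duration time.Duration
}

// parseTimings decodes a header value such as "resolve=3ms;fetch=12ms".
// The format is hypothetical; durations use Go's time.ParseDuration syntax.
func parseTimings(value string) ([]Timing, error) {
	var out []Timing
	for _, part := range strings.Split(value, ";") {
		part = strings.TrimSpace(part)
		if part == "" {
			continue // tolerate empty segments, e.g. a trailing ";"
		}
		kv := strings.SplitN(part, "=", 2)
		if len(kv) != 2 {
			return nil, fmt.Errorf("malformed entry %q", part)
		}
		d, err := time.ParseDuration(kv[1])
		if err != nil {
			return nil, fmt.Errorf("entry %q: %w", kv[0], err)
		}
		out = append(out, Timing{Name: kv[0], Duration: d})
	}
	return out, nil
}

func main() {
	timings, err := parseTimings("resolve=3ms;fetch=12ms;plan-run=1ms")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, t := range timings {
		fmt.Println(t.Name, t.Duration)
	}
}
```

Because the model is just name plus duration, the same list could be filled from other data sources that report query timings in their own formats.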

Dieterbe · 2019-06-06T17:06:27Z

fixed by #1344

This was referenced Feb 28, 2019

Feature instrument render bloomberg/metrictank#69

Merged

Return stats about render endpoint #1230

Closed

Dieterbe mentioned this issue May 1, 2019

Support a "Show plan" endpoint with cost estimation #864

Closed

robert-milan mentioned this issue May 21, 2019

Roadmap #1319

Open

27 tasks

This was referenced Jun 2, 2019

Enhancement - indication of returned data is rollup data grafana/grafana#11700

Closed

render response metadata: stats #1344

Merged

Dieterbe closed this as completed Jun 6, 2019
