Skip to content

Conversation

@dhartunian
Copy link
Collaborator

Previously, it was not possible to emit metrics with static labels that were known at startup. Labels would need to be dynamically added after Metadata was constructed. This made it difficult to define groups of metrics with the same name but different labels.

This commit adds two new fields to the metric.Metadata struct, which are meant to enable a backwards-compatible transition to labeled metrics from today's definitions which reify the label values into the metric name.

The Prometheus export code is modified to accept a flag that governs whether static labels are requested. If so, the labeled name of the metric be used if available.

A future PR will add a /metrics endpoint that is like /_status/ vars except that it will emit the static label version of metrics when they have those definitions available.

Part of: #142570

Release note: None

@dhartunian dhartunian requested review from a team as code owners March 26, 2025 19:32
@dhartunian dhartunian requested review from Abhinav1299, aa-joshi, angles-n-daemons, arjunmahishi and Copilot and removed request for a team March 26, 2025 19:32
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

@abinesha312
Copy link

this should solve your problem

if useStaticLabels && len(md.StaticLabels) > 0 {

@dhartunian dhartunian force-pushed the davidh/push-orrrntktwptm branch 2 times, most recently from f5fc352 to 9966a80 Compare March 27, 2025 16:55
Copy link
Contributor

@angles-n-daemons angles-n-daemons left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good overall, one question - for tsdb, is the plan to add scraping with static labels = false to maintain backwards compatibility?

// GetName is a method on Metadata
GetName() string
// GetLabeledName is a method on Metadata
GetLabeledName() string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not that this is explicitly necessary, but is it possible to modify the GetName function in the same way the GetLabels function was modified, or does it break compatibility with prom?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the suggestion, I hesitated because the PR was getting big. I've got a commit that does the bulk rename that I'll put up separately for easier review once this merges.

@dhartunian
Copy link
Collaborator Author

This looks good overall, one question - for tsdb, is the plan to add scraping with static labels = false to maintain backwards compatibility?

@angles-n-daemons correct. TSDB will not be scraping with static labels.

Previously, it was not possible to emit metrics with static labels
that were known at startup. Labels would need to be dynamically added
after `Metadata` was constructed. This made it difficult to define
groups of metrics with the same name but different labels.

This commit adds two new fields to the `metric.Metadata` struct,
which are meant to enable a backwards-compatible transition to labeled
metrics from today's definitions which reify the label values into the
metric name.

The Prometheus export code is modified to accept a flag that governs
whether static labels are requested. If so, the labeled name of the
metric be used if available.

A future PR will add a `/metrics` endpoint that is like `/_status/
vars` except that it will emit the static label version of metrics
when they have those definitions available.

Part of: cockroachdb#142570

Release note: None
@dhartunian dhartunian force-pushed the davidh/push-orrrntktwptm branch from 9966a80 to e2edfc6 Compare April 7, 2025 17:16
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot reviewed 19 out of 20 changed files in this pull request and generated no comments.

Files not reviewed (1)
  • pkg/util/metric/BUILD.bazel: Language not supported

@dhartunian
Copy link
Collaborator Author

TFTR!

bors r=angles-n-daemons

@craig
Copy link
Contributor

craig bot commented Apr 7, 2025

@craig craig bot merged commit 2733799 into cockroachdb:master Apr 7, 2025
24 checks passed
craig bot pushed a commit that referenced this pull request Apr 9, 2025
143536: metric: add /metrics endpoint with static labels r=angles-n-daemons a=dhartunian

A new metrics endpoint that's Prometheus-compatible, is now available
at `/metrics` similarly to `/_status/vars`. This endpoint will emit
statically labeled metrics if the metric metadata defines static
labels and a labeled name.

Here's an example of what this looks like on a demo cluster with
multitenancy using the few metrics that have been migrated in this PR
from SQL:

```
http GET http://localhost:8080/_status/vars cluster==system | grep sql_insert_count
# HELP sql_insert_count_internal Number of SQL INSERT statements successfully executed (internal queries)
# TYPE sql_insert_count_internal counter
sql_insert_count_internal{node_id="1",tenant="system"} 138
sql_insert_count_internal{node_id="1",tenant="demoapp"} 152
# HELP sql_insert_count Number of SQL INSERT statements successfully executed
# TYPE sql_insert_count counter
sql_insert_count{node_id="1",tenant="system"} 2
sql_insert_count{node_id="1",tenant="demoapp"} 73

~/go/src/github.com/cockroachdb/cockroach remotes/dhartunian/davidh/push-orrrntktwptm* ≡
❯ 
http GET http://localhost:8080/metrics cluster==system | grep sql_insert_count

~/go/src/github.com/cockroachdb/cockroach remotes/dhartunian/davidh/push-orrrntktwptm* ≡
❯ 
http GET http://localhost:8080/metrics cluster==system | grep sql_count
# HELP sql_count Number of SQL UPDATE statements successfully executed (internal queries)
# TYPE sql_count counter
sql_count{node_id="1",tenant="system",query_type="select",query_internal="true"} 172
sql_count{node_id="1",tenant="system",query_type="update"} 0
sql_count{node_id="1",tenant="system",query_type="update",query_internal="true"} 67
sql_count{node_id="1",tenant="system",query_type="select"} 0
sql_count{node_id="1",tenant="system",query_type="delete",query_internal="true"} 29
sql_count{node_id="1",tenant="system",query_type="insert",query_internal="true"} 138
sql_count{node_id="1",tenant="system",query_type="delete"} 0
sql_count{node_id="1",tenant="system",query_type="insert"} 2
sql_count{node_id="1",tenant="demoapp",query_type="update",query_internal="true"} 124
sql_count{node_id="1",tenant="demoapp",query_type="update"} 0
sql_count{node_id="1",tenant="demoapp",query_type="insert"} 269
sql_count{node_id="1",tenant="demoapp",query_type="insert",query_internal="true"} 152
sql_count{node_id="1",tenant="demoapp",query_type="select",query_internal="true"} 243
sql_count{node_id="1",tenant="demoapp",query_type="delete",query_internal="true"} 57
sql_count{node_id="1",tenant="demoapp",query_type="select"} 593
sql_count{node_id="1",tenant="demoapp",query_type="delete"} 0
```

Resolves: #142570

Release note (ops change): Prometheus metrics are now also available
at the `/metrics` endpoint in addition to `/_status/vars`. This new
endpoint will evolve more rapidly as we migrate metrics to use labels
where previously different metric names were defined. Customers can
continue to use `/_status/vars` where the names will not change.


----

First commit can be reviewed here: #143511

Co-authored-by: David Hartunian <davidh@cockroachlabs.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants