Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

UTF-8 support in metric and label names #922

Open
wants to merge 11 commits into
base: main
Choose a base branch
from

Conversation

fedetorres93
Copy link

@fedetorres93 fedetorres93 commented Feb 7, 2024

Adds UTF-8 support for metric and label names.

These changes are based on the work done on the Prometheus common libraries here and here

  • The prometheus-metrics-exposition-formats module will use the new quoting syntax {"foo"} iff the metric does not conform to the legacy name format (foo{})
  • The prometheus-metrics-model has a new flag (NameValidationScheme) which determines if validation is done using the legacy or the UTF-8 scheme
  • Scrapers can announce via content negotiation that they support UTF-8 names by adding escaping=allow-utf-8 in the Accept header. In cases where UTF-8 is not available, metric providers can be configured to escape names in a few different ways: values (U__ UTF value escaping for perfect round-tripping), underscores (all invalid chars become _), dots (dots become _dot_, _ becomes __, all other values become ___). Escaping can either be a global default (PrometheusNaming.nameEscapingScheme) or can also be specified in Accept header with the escaping= term, which can be allow-utf-8 (for UTF-8-compatible), underscores, dots, or values.
    This should still be a noop for existing configurations because scrapers will not be passing the escaping key in the Accept header. Existing functionality is maintained.

Work towards prometheus/common#527

@fedetorres93 fedetorres93 changed the title [ UTF-8 support in metric and label names [WIP] UTF-8 support in metric and label names Feb 7, 2024
@fstab
Copy link
Member

fstab commented Feb 8, 2024

Hello @fedetorres93, thanks for working on this. I noticed this is still a draft, but I have a few general comments, unrelated to specific code lines:

  • I think we need a new exposition format for this. If a Prometheus scrape requests application/openmetrics-text; version=1.0.0; then we should return a response that's valid OpenMetrics 1.0.0. Otherwise we will break all scrapes by existing Prometheus servers.
  • The Prometheus Java client is not only used as a standalone library. It is also used for the Prometheus exporter of the OpenTelemetry Java SDK. Therefore, conversion of OpenTelemetry names to Prometheus names must comply with the Prometheus and OpenMetrics Compatibility Spec. There are many components in the OpenTelemetry ecosystem for converting OTel metrics to Prometheus metrics, and we should try to keep names consistent so that users don't get different names depending on how they set up their pipeline. I think it would be good to reach out to OpenTelemetry's Prometheus WG first and define consistent name conversion. The Prometheus WG already started a design doc.

@ywwg
Copy link
Member

ywwg commented Feb 8, 2024

  • then we should return a response that's valid OpenMetrics 1.0.0. Otherwise we will break all scrapes by existing Prometheus servers.

Part of the content negotiation is a new term, escaping=. If the escaping term is absent or anything other than allow-utf-8, then all responses will be converted to valid legacy prometheus format and will not use the new syntax. See prometheus/common#570

Signed-off-by: Federico Torres <federico.torres@grafana.com>
Signed-off-by: Federico Torres <federico.torres@grafana.com>
Signed-off-by: Federico Torres <federico.torres@grafana.com>
Signed-off-by: Federico Torres <federico.torres@grafana.com>
Signed-off-by: Federico Torres <federico.torres@grafana.com>
Signed-off-by: Federico Torres <federico.torres@grafana.com>
Signed-off-by: Federico Torres <federico.torres@grafana.com>
Signed-off-by: Federico Torres <federico.torres@grafana.com>
Signed-off-by: Federico Torres <federico.torres@grafana.com>
Signed-off-by: Federico Torres <federico.torres@grafana.com>
@fedetorres93 fedetorres93 force-pushed the ftorres/quoted-metric-name branch from 7707188 to bd97179 Compare March 7, 2024 20:10
@fedetorres93 fedetorres93 marked this pull request as ready for review March 7, 2024 20:11
@fedetorres93 fedetorres93 changed the title [WIP] UTF-8 support in metric and label names UTF-8 support in metric and label names Mar 7, 2024
Signed-off-by: Federico Torres <federico.torres@grafana.com>
@fedetorres93
Copy link
Author

Hello @fstab, I see that there's still some discussion going on about OpenTelemetry metric names conversion, but I think that the PR is ready for an initial review, at least until a consensus is reached.

@fedetorres93
Copy link
Author

Hey @fstab, I just wanted to check in regarding this PR. I understand there might still be ongoing talks about the OpenTelemetry metric names conversion, but I was wondering if you had a chance to take an initial look.

Looking forward to any feedback you might have.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants