feat: enable custom metrics prefix #2432
Walkthrough
Adds a runtime-configurable metrics prefix for the Dynamo frontend and HTTP service metrics. Introduces a CLI flag to set the prefix and propagates it via the DYN_METRICS_PREFIX environment variable. Metrics initialization now derives names using this prefix. Includes an integration test validating default and overridden prefixes.
Sequence Diagram(s)
sequenceDiagram
participant User as User/Operator
participant CLI as Frontend CLI
participant Env as Environment
participant RT as Runtime Init
participant Svc as HTTP Service Metrics
participant Prom as Prometheus
User->>CLI: Run with --metrics-prefix (optional)
CLI->>Env: Set DYN_METRICS_PREFIX (if flag provided)
CLI->>RT: Initialize runtime
RT->>Svc: Start HTTP service
Svc->>Env: Read DYN_METRICS_PREFIX or default
Svc->>Svc: Construct metric names with prefix
Prom->>Svc: GET /metrics
Svc-->>Prom: Expose prefixed metrics
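The flow in the diagram can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the constant names and the `DYN_METRICS_PREFIX` variable come from the review above, the `dynamo_frontend` default comes from the PR description, and the helpers `effective_prefix`, `metric_name`, and `prefix_from_env`, along with the `requests_total` suffix used in the note below, are hypothetical names chosen for the example.

```rust
/// Environment variable named in the walkthrough.
const METRICS_PREFIX_ENV: &str = "DYN_METRICS_PREFIX";
/// Default prefix, taken from the PR description.
const FRONTEND_METRIC_PREFIX: &str = "dynamo_frontend";

/// Hypothetical helper: use the override when one is given, else the default.
fn effective_prefix(override_value: Option<&str>) -> String {
    override_value.unwrap_or(FRONTEND_METRIC_PREFIX).to_string()
}

/// Hypothetical helper: join a prefix and a metric suffix with an underscore.
fn metric_name(prefix: &str, suffix: &str) -> String {
    format!("{prefix}_{suffix}")
}

/// Hypothetical helper: read the environment once, at metrics initialization.
/// A missing or non-UTF-8 variable falls back to the default prefix.
fn prefix_from_env() -> String {
    let value = std::env::var(METRICS_PREFIX_ENV).ok();
    effective_prefix(value.as_deref())
}
```

Under these assumptions, `metric_name(&prefix_from_env(), "requests_total")` would yield `dynamo_frontend_requests_total` when the variable is unset, and a name starting with the override otherwise.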
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~15 minutes
Actionable comments posted: 6
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- components/frontend/src/dynamo/frontend/main.py (2 hunks)
- lib/llm/src/http/service/metrics.rs (2 hunks)
- lib/llm/tests/http_metrics.rs (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (2)
lib/llm/tests/http_metrics.rs (2)
- lib/runtime/src/pipeline/network.rs (1): metrics (338-340)
- lib/llm/src/http/service/metrics.rs (3): new (119-213), new (315-336), new (407-416)
lib/llm/src/http/service/metrics.rs (2)
- lib/llm/src/http/service/service_v2.rs (1): new (29-35)
- lib/runtime/src/metrics.rs (1): prefix (394-400)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Mirror Repository to GitLab
- GitHub Check: pre-merge-rust (.)
- GitHub Check: pre-merge-rust (lib/bindings/python)
- GitHub Check: Build and Test - dynamo
🔇 Additional comments (3)
components/frontend/src/dynamo/frontend/main.py (1)
136-141: Good addition: CLI flag and clear precedence semantics
Adding --metrics-prefix with explicit fallback to env var or default is clear and backward-compatible.
lib/llm/src/http/service/metrics.rs (2)
15-20: Public constants look good and align with intended defaults
FRONTEND_METRIC_PREFIX and METRICS_PREFIX_ENV are well-named and make downstream usage (e.g., tests) straightforward.
107-119: Docs are clear and comprehensive
The docstring enumerating the env var and metric names is helpful and accurate.
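The precedence praised in the first comment (flag, then env var, then default) could be expressed as a single pure function. The sketch below is only an illustration written in Rust for consistency with the rest of the crate. The actual flag handling lives in the Python frontend and is not shown here, and `resolve_metrics_prefix` and `DEFAULT_PREFIX` are hypothetical names.

```rust
/// Default prefix, taken from the PR description.
const DEFAULT_PREFIX: &str = "dynamo_frontend";

/// Hypothetical helper: an explicit CLI value wins, then the environment
/// value, then the built-in default.
/// Assumption: an empty string counts as "set"; the PR may treat it differently.
fn resolve_metrics_prefix(cli_flag: Option<&str>, env_value: Option<&str>) -> String {
    cli_flag
        .or(env_value)
        .unwrap_or(DEFAULT_PREFIX)
        .to_string()
}
```

Keeping the resolution free of direct environment access makes each branch easy to check in a unit test without mutating process-wide state.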
|
@ryan-lempka This looks good. Can you do all the Code Rabbit suggestions? Particularly the one about avoiding |
@grahamking agreed, these suggestions are solid. I'll add them all now. |
82120ba to bfac89b
bfac89b to cd3b8e2
|
Is it the case that NIM LLM will only use |
async fn metrics_prefix_default_then_env_override() {
    // Case 1: default prefix
    env::remove_var(metrics::METRICS_PREFIX_ENV);
    let svc1 = HttpService::builder().port(9101).build().unwrap();
I see this follows the examples in http-service.rs, and it's a bit sad that many tests there don't use random available ports (port 0). For tests, it's best to use random available ports so that tests running in parallel don't collide with other existing services or tests. Something like this would be preferred:
async fn create_http_service_with_random_port() -> (HttpService, u16) {
    // Bind to port 0 to get a random available port
    let listener = tokio::net::TcpListener::bind("127.0.0.1:0").await.unwrap();
    let actual_port = listener.local_addr().unwrap().port();
    // Drop the listener since HttpService will create its own
    drop(listener);
    // Create service with the actual port
    let service = HttpService::builder().port(actual_port).build().unwrap();
    (service, actual_port)
}
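One caveat with the drop-and-rebind approach is that another process can claim the port between `drop(listener)` and the service's own bind. A variant that keeps the listener alive avoids that window. The sketch below uses only the standard library. `bind_ephemeral` is a hypothetical helper, and whether `HttpService` can accept an already-bound listener is not shown in this thread, so treat this as an illustration of the idea rather than a drop-in replacement.

```rust
use std::io;
use std::net::TcpListener;

/// Hypothetical helper: bind to port 0 so the OS assigns a free port, and
/// return the listener together with that port. While the caller holds the
/// listener, the port stays reserved.
fn bind_ephemeral() -> io::Result<(TcpListener, u16)> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let port = listener.local_addr()?.port();
    Ok((listener, port))
}
```

If the service can only take a port number, the listener still has to be dropped before the service binds, which reintroduces the small race described above.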
I will use random ports in the future.
|
Signed-off-by: Hannah Zhang <hannahz@nvidia.com>
Overview:
Allow for the specification of a custom metrics prefix for the frontend metrics names.
Details:
dynamo_frontend; set to nv_llm_http_service for NIM LLM compatibility or to use a preferred prefix
Where should the reviewer start?
Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)
Summary by CodeRabbit
New Features
Tests