components/metrics/README.md | 27 additions & 17 deletions
--- a/components/metrics/README.md
+++ b/components/metrics/README.md
@@ -1,13 +1,23 @@
 # Metrics
 
-The `metrics` component is a utility that can collect, aggregate, and publish
-metrics from a Dynamo deployment. After collecting and aggregating metrics from
-workers, it exposes them via an HTTP `/metrics` endpoint in Prometheus format
-that other applications or visualization tools like Prometheus server and Grafana can
-pull from.
-
-**Note**: This is a demo implementation. The metrics component is currently under active development and this documentation will change as the implementation evolves.
-- In this demo the metrics names use the prefix "llm", but in production they will be prefixed with "nv_llm" (e.g., the HTTP `/metrics` endpoint will serve metrics with "nv_llm" prefixes)
+⚠️ **DEPRECATION NOTICE** ⚠️
+
+**This `metrics` component is unmaintained and being deprecated.**
+
+The deprecated `metrics` component is being replaced by the **`MetricsRegistry`** built-in functionality that is now available directly in the `DistributedRuntime` framework. The `MetricsRegistry` provides:
+
+**For new projects and existing deployments, please migrate to using `MetricsRegistry` instead of this component.**
+
+This component may be migrated to the MetricsRegistry in the future.
+
+**📖 See the [Dynamo MetricsRegistry Guide](../../docs/guides/metrics.md) for detailed information on using the new metrics system.**
+
+---
+
+The deprecated `metrics` component is a utility for collecting, aggregating, and publishing metrics from a Dynamo deployment, but it is unmaintained and being deprecated in favor of `MetricsRegistry`.
+
+**Note**: This is a demo implementation. The deprecated `metrics` component is no longer under active development.
+- In this demo the metrics names use the prefix "llm", but in production they will be prefixed with "dynamo" (e.g., the HTTP `/metrics` endpoint will serve metrics with "dynamo" prefixes)
 - This demo will only work when using examples/llm/configs/agg.yml-- other configurations will not work
 
 <div align="center">
@@ -16,7 +26,7 @@ pull from.
 
 ## Quickstart
 
-To start the `metrics` component, simply point it at the `namespace/component/endpoint`
+To start the deprecated `metrics` component, simply point it at the `namespace/component/endpoint`
 trio for the Dynamo workers that you're interested in monitoring metrics on.
 
 This will:
@@ -45,14 +55,14 @@ will get automatically discovered and the warnings will stop.
 
 ## Workers
 
-The `metrics` component needs running workers to gather metrics from,
+The deprecated `metrics` component needs running workers to gather metrics from,
 so below are some examples of workers and how they can be monitored.
 
 ### Mock Worker
 
-To try out how `metrics` works, there is a demo Rust-based
+To try out how the deprecated `metrics` component works, there is a demo Rust-based
 [mock worker](src/bin/mock_worker.rs) that provides sample data through two mechanisms:
-1. Exposes a stats handler at `dynamo/MyComponent/my_endpoint` that responds to polling requests (from `metrics`) with randomly generated `ForwardPassMetrics` data
+1. Exposes a stats handler at `dynamo/MyComponent/my_endpoint` that responds to polling requests (from the deprecated `metrics` component) with randomly generated `ForwardPassMetrics` data
 2. Publishes mock `KVHitRateEvent` data every second to demonstrate event-based metrics
 
 Step 1: Launch a mock workers via the following command (if already built):
deploy/metrics/README.md | 10 additions & 7 deletions
--- a/deploy/metrics/README.md
+++ b/deploy/metrics/README.md
@@ -60,7 +60,7 @@ As of Q2 2025, Dynamo HTTP Frontend metrics are exposed when you build container
 
 - Start the [components/metrics](../../components/metrics/README.md) application to begin monitoring for metric events from dynamo workers and aggregating them on a Prometheus metrics endpoint: `http://localhost:9091/metrics`.
 - Uncomment the appropriate lines in prometheus.yml to poll port 9091.
-- Start worker(s) that publishes KV Cache metrics: [examples/rust/service_metrics/bin/server](../../lib/runtime/examples/service_metrics/README.md)` can populate dummy KV Cache metrics.
+- Start worker(s) that publishes KV Cache metrics: [lib/runtime/examples/service_metrics/README.md](../../lib/runtime/examples/service_metrics/README.md) can populate dummy KV Cache metrics.
 
 
 ## Configuration
@@ -95,16 +95,19 @@ The following configuration files should be present in this directory:
 - [grafana_dashboards/grafana-dcgm-metrics.json](./grafana_dashboards/grafana-dcgm-metrics.json): Contains Grafana dashboard configuration for DCGM GPU metrics
 - [grafana_dashboards/grafana-llm-metrics.json](./grafana_dashboards/grafana-llm-metrics.json): This file, which is being phased out, contains the Grafana dashboard configuration for LLM-specific metrics. It requires an additional `metrics` component to operate concurrently. A new version is under development.
 
-## Running the example `metrics` component
+## Running the deprecated `metrics` component
 
-IMPORTANT: This section is being phased out, and some metrics may not function as expected. A new solution is under development.
+⚠️ **DEPRECATION NOTICE** ⚠️
 
-When you run the example [components/metrics](../../components/metrics/README.md) component, it exposes a Prometheus /metrics endpoint with the followings (defined in [../../components/metrics/src/lib.rs](../../components/metrics/src/lib.rs)):
-- `llm_requests_active_slots`: Number of currently active request slots per worker
+When you run the example [components/metrics](../../components/metrics/README.md) component, it exposes a Prometheus /metrics endpoint with the following metrics (defined in [components/metrics/src/lib.rs](../../components/metrics/src/lib.rs)):
+
+**⚠️ The following `llm_kv_*` metrics are deprecated:**
+
+- `llm_requests_active_slots`: Active request slots per worker
 - `llm_requests_total_slots`: Total available request slots per worker
-- `llm_kv_blocks_active`: Number of active KV blocks per worker
+- `llm_kv_blocks_active`: Active KV blocks per worker
 - `llm_kv_blocks_total`: Total KV blocks available per worker
-- `llm_kv_hit_rate_percent`: Cumulative KV Cache hit percent per worker
+- `llm_kv_hit_rate_percent`: KV Cache hit percent per worker
 - `llm_load_avg`: Average load across workers
 - `llm_load_std`: Load standard deviation across workers
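
Reviewer note: when migrating dashboards away from the deprecated component, it may help to check which of the metric names listed in this diff still appear in a scrape. The Python sketch below is one possible way to do that, not part of this change. It assumes the scrape is available as Prometheus text-format output (for example, saved from `http://localhost:9091/metrics`). The function name `find_listed_metrics` and the `worker_id` label in the sample are hypothetical, chosen only for illustration.

```python
import re

# Metric names copied from the deploy/metrics README diff above.
LISTED_METRICS = {
    "llm_requests_active_slots",
    "llm_requests_total_slots",
    "llm_kv_blocks_active",
    "llm_kv_blocks_total",
    "llm_kv_hit_rate_percent",
    "llm_load_avg",
    "llm_load_std",
}

# Matches the metric name at the start of a sample line in the
# Prometheus text exposition format.
_SAMPLE_NAME = re.compile(r"^([a-zA-Z_:][a-zA-Z0-9_:]*)")


def find_listed_metrics(exposition_text):
    """Return the set of listed metric names that occur in the scrape text."""
    found = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        match = _SAMPLE_NAME.match(line)
        if match and match.group(1) in LISTED_METRICS:
            found.add(match.group(1))
    return found


if __name__ == "__main__":
    # Hypothetical scrape output; the label name "worker_id" is an assumption.
    sample = (
        "# TYPE llm_kv_blocks_active gauge\n"
        'llm_kv_blocks_active{worker_id="0"} 12\n'
        "llm_load_avg 0.5\n"
        "some_other_metric 1\n"
    )
    for name in sorted(find_listed_metrics(sample)):
        print(name)
```

An empty result for a live scrape would suggest that nothing is publishing these names any more, which is the state you would expect once a deployment no longer runs the deprecated component.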