
Commit 20a15cc

[V1][Metrics] Deprecate vllm:model_forward/execute_time_milliseconds
Metrics originally added by vllm-project#9659. These seem to be of questionable value relative to the existing prefill, decode, and inference time metrics. Since they would also be challenging to implement in V1, and they don't conform to the standard of using seconds as units, let's deprecate them.

Signed-off-by: Mark McLoughlin <markmc@redhat.com>
1 parent df51e19 commit 20a15cc
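
For context on the units point in the commit message: the Prometheus convention is to export durations in base units (seconds) and let dashboards rescale, which is why the millisecond-based histograms touched by this commit are being deprecated rather than carried into V1. The sketch below shows what a seconds-based request-timing histogram looks like when defined directly with prometheus_client; the metric name example:request_inference_time_seconds, its label, and its buckets are illustrative assumptions, not vLLM's actual metric definitions.

from prometheus_client import CollectorRegistry, Histogram

registry = CollectorRegistry()

# Durations exported in seconds (the Prometheus base unit), unlike the
# deprecated *_milliseconds histograms touched by this commit.
inference_time = Histogram(
    "example:request_inference_time_seconds",  # illustrative name only
    "Histogram of end-to-end inference time per request, in seconds.",
    labelnames=["model_name"],
    buckets=[0.1, 0.25, 0.5, 1.0, 2.5, 5.0, 10.0, 30.0, 60.0],
    registry=registry)

# Record a 1.234 s request for a given model label.
inference_time.labels(model_name="example-model").observe(1.234)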

vllm/engine/metrics.py

Lines changed: 12 additions & 4 deletions
@@ -190,18 +190,26 @@ def __init__(self, labelnames: List[str], vllm_config: VllmConfig):
                 "DEPRECATED: use vllm:request_queue_time_seconds instead."),
             labelnames=labelnames,
             buckets=request_latency_buckets)
+
+        # Deprecated in 0.8 - use prefill/decode/inference time metrics
+        # TODO: in 0.9, only enable if show_hidden_metrics=True
         self.histogram_model_forward_time_request = self._histogram_cls(
             name="vllm:model_forward_time_milliseconds",
-            documentation=
-            "Histogram of time spent in the model forward pass in ms.",
+            documentation=(
+                "Histogram of time spent in the model forward pass in ms. "
+                "DEPRECATED: use prefill/decode/inference time metrics instead."
+            ),
             labelnames=labelnames,
             buckets=build_1_2_3_5_8_buckets(3000))
         self.histogram_model_execute_time_request = self._histogram_cls(
             name="vllm:model_execute_time_milliseconds",
-            documentation=
-            "Histogram of time spent in the model execute function in ms.",
+            documentation=(
+                "Histogram of time spent in the model execute function in ms. "
+                "DEPRECATED: use prefill/decode/inference time metrics instead."
+            ),
             labelnames=labelnames,
             buckets=build_1_2_3_5_8_buckets(3000))
+
         # Metadata
         self.histogram_num_prompt_tokens_request = self._histogram_cls(
             name="vllm:request_prompt_tokens",
