
Commit 62a33c2

[Serve.LLM] Add avg prompt length metric (#58599)
## Description

Add an average prompt length metric. When prompt lengths are uniform (especially in testing), the P50 and P90 computations are skewed by the 1-2-5 histogram buckets used in vLLM. Average prompt length provides another useful dimension to inspect and validate against. For example, with a uniform ISL of 5000, P50 shows 7200 and P90 shows 9400, while the average accurately shows 5000.

(Screenshot: Serve LLM dashboard panel showing the new Average series alongside the P50 and P90 prompt-length series.)

---------

Signed-off-by: Rui Qiao <ruisearch42@gmail.com>
Signed-off-by: Rui Qiao <161574667+ruisearch42@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
1 parent 0c4dcb0 commit 62a33c2

File tree

1 file changed: +4 −0

python/ray/dashboard/modules/metrics/dashboards/serve_llm_dashboard_panels.py

Lines changed: 4 additions & 0 deletions
```diff
@@ -223,6 +223,10 @@
             expr='histogram_quantile(0.90, sum by(le, model_name, WorkerId) (rate(ray_vllm_request_prompt_tokens_bucket{{model_name=~"$vllm_model_name", WorkerId=~"$workerid", {global_filters}}}[$interval])))',
             legend="P90-{{model_name}}-{{WorkerId}}",
         ),
+        Target(
+            expr='(sum by(model_name, WorkerId) (rate(ray_vllm_request_prompt_tokens_sum{{model_name=~"$vllm_model_name", WorkerId=~"$workerid", {global_filters}}}[$interval]))\n/\nsum by(model_name, WorkerId) (rate(ray_vllm_request_prompt_tokens_count{{model_name=~"$vllm_model_name", WorkerId=~"$workerid", {global_filters}}}[$interval])))',
+            legend="Average-{{model_name}}-{{WorkerId}}",
+        ),
     ],
     fill=1,
     linewidth=1,
```
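The new Average target divides `rate(..._sum)` by `rate(..._count)`, which is exact regardless of bucket layout, while `histogram_quantile` can only interpolate linearly inside a bucket. A minimal sketch of that effect, assuming 100 identical prompts of 5200 tokens and an illustrative 1-2-5 bucket layout (these numbers are hypothetical, not taken from the PR):

```python
def histogram_quantile(q, buckets):
    """Prometheus-style quantile estimate from cumulative (le, count) buckets."""
    total = buckets[-1][1]
    rank = q * total
    prev_le, prev_count = 0.0, 0.0
    for le, count in buckets:
        if count >= rank:
            # Linear interpolation inside the bucket, as Prometheus does.
            return prev_le + (le - prev_le) * (rank - prev_count) / (count - prev_count)
        prev_le, prev_count = le, count
    return buckets[-1][0]

# All 100 observations land in the (5000, 10000] bucket.
buckets = [(1000, 0), (2000, 0), (5000, 0), (10000, 100)]
p50 = histogram_quantile(0.50, buckets)  # interpolates to 7500, far from 5200
p90 = histogram_quantile(0.90, buckets)  # interpolates to 9500, far from 5200
avg = (5200 * 100) / 100                 # sum / count recovers 5200 exactly
```

The `avg` line mirrors what the dashboard query computes from the `_sum` and `_count` series, which is why the Average panel stays accurate where the quantile panels drift.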
