[Benchmark] Enable benchmark to run with encoding_format="bytes" #27467
Conversation
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Code Review
This pull request adds support for `encoding_format="bytes"` in the benchmarking tool for pooling requests, correctly handling the case where metadata is sent in response headers instead of the body. I've added one comment suggesting more robust header parsing: using `.get()` avoids a potential `KeyError` and allows a more informative error message on failure. Overall, the change is good and addresses its intended purpose.
```python
metadata = json.loads(response.headers["metadata"])
usage = metadata.get("usage", {})
```
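For context, here is a self-contained sketch of the response-handling path these lines sit in. The function name and the aiohttp wiring are illustrative assumptions, not the benchmark's exact code; only the header/usage parsing mirrors the diff above.

```python
# Illustrative sketch of the response-handling path this diff touches.
# The function name and aiohttp wiring are assumptions for demonstration;
# only the header/usage parsing mirrors the diff above.
import json

import aiohttp


async def parse_pooling_response(
    session: aiohttp.ClientSession, url: str, payload: dict
) -> int:
    async with session.post(url, json=payload) as response:
        # With encoding_format="bytes", the body holds raw embedding
        # bytes, so usage metadata is carried in a response header.
        metadata = json.loads(response.headers["metadata"])
        usage = metadata.get("usage", {})
        return usage.get("prompt_tokens", 0)
```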
Using direct dictionary access `response.headers["metadata"]` is unsafe, as it will raise a `KeyError` if the header is missing. While the surrounding `try...except` block will catch this, the resulting error message (`'metadata'`) is not very informative for debugging. It's more robust to use `.get()` to safely access the header and raise a `ValueError` with a clear error message if it's not present. This will improve error reporting for failed benchmark requests.
```python
metadata_str = response.headers.get("metadata")
if not metadata_str:
    raise ValueError("Missing 'metadata' header for 'bytes' encoding.")
metadata = json.loads(metadata_str)
usage = metadata.get("usage", {})
```
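As a standalone illustration of the design choice (not part of the suggestion itself), compare the two failure modes when the header is absent:

```python
# Standalone illustration of why the explicit ValueError is friendlier
# than a bare KeyError when the "metadata" header is absent.
import json

headers: dict[str, str] = {}  # simulate a response missing the header

try:
    json.loads(headers["metadata"])
except KeyError as exc:
    print(f"KeyError: {exc}")  # prints only: KeyError: 'metadata'

metadata_str = headers.get("metadata")
if not metadata_str:
    # The message names both the header and the encoding mode,
    # which makes a failed benchmark request self-explanatory.
    print("ValueError: Missing 'metadata' header for 'bytes' encoding.")
```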
Purpose
Support benchmarking the `encoding_format="bytes"` pooling responses introduced in #27066.
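For context, a minimal sketch of what a bytes-encoded embeddings request might look like against an OpenAI-compatible endpoint. The URL, the float32 dtype, and the exact metadata header layout are assumptions drawn from this PR's discussion rather than confirmed vLLM behavior.

```python
# Hedged sketch: query an OpenAI-compatible /v1/embeddings endpoint with
# encoding_format="bytes" and decode the raw response body. The float32
# dtype and the "metadata" response header are assumptions from this PR.
import json

import numpy as np
import requests

resp = requests.post(
    "http://localhost:8000/v1/embeddings",
    json={
        "model": "Qwen/Qwen3-Embedding-0.6B",
        "input": "hello world",
        "encoding_format": "bytes",
    },
)
# With "bytes", the body is raw embedding data rather than JSON,
# so usage statistics travel in a response header instead.
metadata = json.loads(resp.headers.get("metadata", "{}"))
embedding = np.frombuffer(resp.content, dtype=np.float32)
print(embedding.shape, metadata.get("usage", {}))
```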
Test Plan
```bash
vllm bench serve \
  --model Qwen/Qwen3-Embedding-0.6B \
  --backend openai-embeddings \
  --endpoint /v1/embeddings \
  --dataset-name sharegpt \
  --dataset-path benchmarks/ShareGPT_V3_unfiltered_cleaned_split.json \
  --extra_body '{"encoding_format": "bytes"}'
```
Test Result
Essential Elements of an Effective PR Description Checklist
Update `supported_models.md` and `examples` for a new model.