@DarkLight1337 DarkLight1337 commented Oct 24, 2025

Purpose

Support benchmarking #27066: allow vllm bench serve to send pooling requests with encoding_format="bytes".

Test Plan

vllm bench serve \
    --model Qwen/Qwen3-Embedding-0.6B \
    --backend openai-embeddings \
    --endpoint /v1/embeddings \
    --dataset-name sharegpt \
    --dataset-path benchmarks/ShareGPT_V3_unfiltered_cleaned_split.json \
    --extra_body '{"encoding_format": "bytes"}'

Test Result


Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
@DarkLight1337 DarkLight1337 requested a review from noooop October 24, 2025 08:58
@DarkLight1337 DarkLight1337 added the ready ONLY add when PR is ready to merge/full CI is needed label Oct 24, 2025
@mergify mergify bot added the performance Performance-related issues label Oct 24, 2025
@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request adds support for encoding_format="bytes" in the benchmarking tool for pooling requests. The change correctly handles the case where metadata is sent in response headers instead of the body. I've added one comment to improve the robustness of header parsing by using .get() to avoid potential KeyErrors and provide more informative error messages on failure. Overall, the change is good and addresses the intended purpose.
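To illustrate the header-versus-body distinction mentioned in the review, here is a minimal sketch, not the code merged in this PR. The helper name extract_usage and the sample payloads are hypothetical; only the "metadata" header and the "usage" key come from the diff quoted in this conversation. It assumes that with encoding_format="bytes" the body is raw binary, so usage has to be read from the header, while other formats carry it in the JSON body.

```python
import json
from collections.abc import Mapping


def extract_usage(
    encoding_format: str, body: bytes, headers: Mapping[str, str]
) -> dict:
    """Return the usage dict of a pooling response (hypothetical helper)."""
    if encoding_format == "bytes":
        # The body is raw binary data, so metadata travels in a header.
        metadata = json.loads(headers["metadata"])
    else:
        # The body is a JSON document that carries usage itself.
        metadata = json.loads(body)
    return metadata.get("usage", {})


# Made-up payloads, for illustration only.
json_usage = extract_usage(
    "float", b'{"data": [], "usage": {"prompt_tokens": 7}}', {}
)
bytes_usage = extract_usage(
    "bytes", b"\x00\x01", {"metadata": '{"usage": {"prompt_tokens": 7}}'}
)
print(json_usage, bytes_usage)
```

A plain dict stands in for the response headers here; real HTTP client libraries usually expose a case-insensitive mapping instead.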

Comment on lines +503 to +504
metadata = json.loads(response.headers["metadata"])
usage = metadata.get("usage", {})

Severity: high

Using direct dictionary access response.headers["metadata"] is unsafe as it will raise a KeyError if the header is missing. While the surrounding try...except block will catch this, the resulting error message ('metadata') is not very informative for debugging. It's more robust to use .get() to safely access the header and raise a ValueError with a clear error message if it's not present. This will improve error reporting for failed benchmark requests.

metadata_str = response.headers.get("metadata")
if not metadata_str:
    raise ValueError("Missing 'metadata' header for 'bytes' encoding.")
metadata = json.loads(metadata_str)
usage = metadata.get("usage", {})
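The difference in error reporting can be checked in isolation. The sketch below is an illustration only: read_metadata and error_text are hypothetical helper names, and a plain dict stands in for the response headers. It compares the message produced by direct indexing with the one produced by the suggested .get() check.

```python
import json
from collections.abc import Mapping


def read_metadata(headers: Mapping[str, str]) -> dict:
    """Parse the 'metadata' header, failing with a descriptive message."""
    metadata_str = headers.get("metadata")
    if not metadata_str:
        raise ValueError("Missing 'metadata' header for 'bytes' encoding.")
    return json.loads(metadata_str)


def error_text(fn) -> str:
    """Run fn and return the message of any exception it raises."""
    try:
        fn()
    except Exception as exc:
        return str(exc)
    return ""


empty_headers: dict[str, str] = {}

# Direct indexing raises KeyError, whose message is only the quoted key.
direct_msg = error_text(lambda: json.loads(empty_headers["metadata"]))
# The guarded version explains what is missing and why it matters.
safe_msg = error_text(lambda: read_metadata(empty_headers))

print(repr(direct_msg))
print(repr(safe_msg))
```

Because the surrounding try...except in the benchmark records the exception text, the second message is what would show up in a failed request's error output under this approach.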

@DarkLight1337 DarkLight1337 enabled auto-merge (squash) October 24, 2025 09:05
@DarkLight1337 DarkLight1337 merged commit b7030d9 into vllm-project:main Oct 24, 2025
51 checks passed
@DarkLight1337 DarkLight1337 deleted the bench-binary branch October 24, 2025 11:16
atalhens pushed a commit to atalhens/vllm that referenced this pull request Oct 24, 2025
…llm-project#27467)

Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
kingsmad pushed a commit to kingsmad/vllm that referenced this pull request Oct 25, 2025
…llm-project#27467)

Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
rohin-garg pushed a commit to rohin-garg/vllm that referenced this pull request Oct 25, 2025
…llm-project#27467)

Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
0xrushi pushed a commit to 0xrushi/vllm that referenced this pull request Oct 26, 2025
…llm-project#27467)

Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: 0xrushi <6279035+0xrushi@users.noreply.github.com>