
Conversation

@Chenyaaang
Contributor

@Chenyaaang Chenyaaang commented Apr 1, 2025

In benchmark_serving_structured_output.py, when the dataset is json, use the originally loaded schema instead of creating json_schemas.

The sample command provided in benchmark_serving_structured_output.py doesn't work because it fails at sample_requests() for the json dataset:
python benchmarks/benchmark_serving_structured_output.py \
    --backend \
    --model <your_model> \
    --dataset json \
    --structured-output-ratio 1.0 \
    --structured-output-backend xgrammar \
    --request-rate 10 \
    --num-prompts 1000

Signed-off-by: Chenyaaang <chenyangli@google.com>
@github-actions

github-actions bot commented Apr 1, 2025

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, they only run fastcheck CI, which runs a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of these by going to your fastcheck build on the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

🚀

@yaochengji yaochengji added the ready (ONLY add when PR is ready to merge/full CI is needed) label on Apr 2, 2025
@yaochengji
Collaborator

To better understand the fix, could you please provide a test case that demonstrates the bug you've addressed?

@DarkLight1337
Member

I think it would be cleaner to just add another if-else branch.

elif args.dataset == "json":
    json_schemas = [schema]

That way we don't have to modify gen_prompt and get_schema at all.
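For illustration, a minimal self-contained sketch of this branch idea (plain Python, not the actual benchmark script; dataset and schema below are stand-ins for args.dataset and the schema the script loads from disk):

import json

# Stand-ins for args.dataset and the schema loaded by the benchmark script.
dataset = "json"
schema = json.loads('{"type": "object", "properties": {"id": {"type": "integer"}}}')

if dataset == "json":
    json_schemas = [schema]   # reuse the already-loaded schema directly
else:
    json_schemas = []         # other datasets generate their schemas elsewhere

print(len(json_schemas))      # -> 1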

@Chenyaaang
Contributor Author

json_schemas

len(json_schemas) should be the same as the prompt size, so we would need to copy the schema num_prompts times (that's unnecessary)

@Chenyaaang
Contributor Author

To better understand the fix, could you please provide a test case that demonstrates the bug you've addressed?

Added in the description.

@DarkLight1337
Member

DarkLight1337 commented Apr 3, 2025

json_schemas

len(json_schemas) should be the same as the prompt size, so we would need to copy the schema num_prompts times (that's unnecessary)

How about json_schemas = [schema] * args.num_prompts?

Edit: actually it doesn't look like they need to be of the same length, since we apply the modulus operator before indexing the list of schemas.
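To see why the lengths need not match, here is a small standalone sketch (plain Python, independent of the script; num_prompts and the schema contents are placeholders) of the modulus-before-indexing pattern, which lets a one-element json_schemas list serve every prompt:

# Placeholder values for illustration only.
num_prompts = 4
schema = {"type": "object", "properties": {"name": {"type": "string"}}}
json_schemas = [schema]                      # length 1, not num_prompts

for i in range(num_prompts):
    # Modulus before indexing: any list length >= 1 works.
    selected = json_schemas[i % len(json_schemas)]
    print(i, selected is schema)             # True for every prompt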

@mergify mergify bot added the tpu (Related to Google TPUs) label on Apr 9, 2025
Signed-off-by: Chenyaaang <chenyangli@google.com>
@mergify mergify bot removed the tpu (Related to Google TPUs) label on Apr 10, 2025
@Chenyaaang
Contributor Author


How about json_schemas = [schema] * args.num_prompts?

Edit: actually it doesn't look like they need to be of the same length, since we apply the modulus operator before indexing the list of schemas.

Thanks, fixed. PTAL.

Member

@DarkLight1337 DarkLight1337 left a comment

Thanks for bearing with me!

@DarkLight1337 DarkLight1337 enabled auto-merge (squash) April 10, 2025 17:48
@DarkLight1337 DarkLight1337 merged commit 5fbab20 into vllm-project:main Apr 10, 2025
25 checks passed
p88h pushed a commit to p88h/vllm that referenced this pull request Apr 10, 2025
Signed-off-by: Chenyaaang <chenyangli@google.com>
yangw-dev pushed a commit to yangw-dev/vllm that referenced this pull request Apr 21, 2025
Signed-off-by: Chenyaaang <chenyangli@google.com>
Signed-off-by: Yang Wang <elainewy@meta.com>
jikunshang pushed a commit to jikunshang/vllm that referenced this pull request Apr 29, 2025
Signed-off-by: Chenyaaang <chenyangli@google.com>
lk-chen pushed a commit to lk-chen/vllm that referenced this pull request Apr 29, 2025
Signed-off-by: Chenyaaang <chenyangli@google.com>
RichardoMrMu pushed a commit to RichardoMrMu/vllm that referenced this pull request May 12, 2025
Signed-off-by: Chenyaaang <chenyangli@google.com>
Signed-off-by: Mu Huai <tianbowen.tbw@antgroup.com>
