[Frontend] Support override generation config in args #12409

liuyanyi · 2025-01-24T15:09:27Z

Support override generation config in args

In my previous PR #11164, the generation config could be loaded from the model or from another file.

In this PR, the generation config can be overridden by the user through args or the model config, which allows controlling the generation config from CLI args.

override_generation_config has been added to ModelConfig and EngineArgs. A test has been added as well.

Example usage:

vllm serve Qwen/Qwen2.5-1.5B-Instruct --override-generation-config '{"top_k": 5}'

The log will show:

INFO 01-24 23:05:28 serving_chat.py:90] Overwriting default chat sampling param with: {'top_k': 5}
INFO 01-24 23:05:28 serving_completion.py:54] Overwriting default completion sampling param with: {'top_k': 5}
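
A note on quoting: the flag value has to reach the program as valid JSON, so in a POSIX shell the JSON is usually wrapped in single quotes. The sketch below illustrates this with a stand-in argparse parser. It assumes the flag is parsed as a JSON object; the parser here is hypothetical and is not vLLM's actual argument parser.

```python
import argparse
import json

# Hypothetical stand-in parser, used only to illustrate JSON-valued flags.
parser = argparse.ArgumentParser()
parser.add_argument("--override-generation-config", type=json.loads, default=None)

# What the program receives when the shell command single-quotes the JSON:
#   --override-generation-config '{"top_k": 5}'
args = parser.parse_args(["--override-generation-config", '{"top_k": 5}'])
print(args.override_generation_config)

# With "{"top_k": 5}" a POSIX shell removes the inner double quotes, so the
# program receives the single argument {top_k: 5}, which is not valid JSON.
try:
    json.loads("{top_k: 5}")
    unquoted_ok = True
except json.JSONDecodeError:
    unquoted_ok = False
print(unquoted_ok)
```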

Signed-off-by: liuyanyi <wolfsonliu@163.com>

github-actions · 2025-01-24T15:09:40Z

👋 Hi! Thank you for contributing to the vLLM project.
Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run fastcheck CI which starts running only a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build on Buildkite UI (linked in the PR checks section) and unblock them. If you do not have permission to unblock, ping simon-mo or khluu to add you in our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can do one of these:

Add ready label to the PR
Enable auto-merge.

🚀

DarkLight1337 · 2025-01-24T15:38:22Z

vllm/engine/arg_utils.py

+            default=None,
+            help="Override or set generation config. "
+            "Defaults to None, will use for the default generation config. "
+            "e.g. ``{\"temperature\": 0.5, \"top_k\": 50}``.")


Can you explain the use of "auto" here?

liuyanyi · 2025-01-25

Do you mean the "auto" value for "--generation-config"?
If "auto" is set, the generation config is loaded from the model directory. "--override-generation-config" allows the user to set the generation config manually.
I think the newly added test shows how these parameters interact.

DarkLight1337 · 2025-01-25

Yes, can you update the help string to include this information?

DarkLight1337

Thanks for improving on this!

DarkLight1337 · 2025-01-27T05:56:29Z

Please merge from main to fix the merge conflicts.

liuyanyi · 2025-01-27T06:52:49Z

Done

ChewKokWah · 2025-02-12T15:52:45Z

If we have already set our own generation config through vllm.SamplingParams(), do we also need to set the sampling parameters with --override-generation-config in order to override generation_config.json? I used to think that setting vllm.SamplingParams() would override generation_config.json; is that correct?
If vllm.SamplingParams() and --override-generation-config set different values for the same parameter, which one will vLLM use?

DarkLight1337 · 2025-02-12T15:56:17Z

--override-generation-config only sets the default sampling params. Per-request sampling params will override this.
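
The precedence described in this thread can be sketched roughly as follows. This is a minimal illustration in plain Python, not vLLM's implementation: the function name resolve_sampling_params and the dict-based representation are hypothetical, and it assumes the model's generation config is loaded (for example with "--generation-config auto").

```python
def resolve_sampling_params(model_generation_config,
                            override_generation_config,
                            request_params):
    """Hypothetical helper: later layers take priority over earlier ones."""
    # 1. Start from the generation config loaded from the model directory.
    defaults = dict(model_generation_config or {})
    # 2. Values from --override-generation-config replace those defaults.
    defaults.update(override_generation_config or {})
    # 3. Parameters set explicitly on the request override the defaults.
    resolved = dict(defaults)
    resolved.update({k: v for k, v in request_params.items() if v is not None})
    return resolved


resolved = resolve_sampling_params(
    model_generation_config={"temperature": 0.7, "top_k": 20},
    override_generation_config={"top_k": 5},
    request_params={"temperature": 0.2},
)
print(resolved)  # {'temperature': 0.2, 'top_k': 5}
```

In this sketch, top_k comes from the override because the request does not set it, while temperature comes from the request itself.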

liuyanyi added 3 commits January 24, 2025 22:32

support override gen config

2c11df8

Signed-off-by: liuyanyi <wolfsonliu@163.com>

add tests

9d3f277

Signed-off-by: liuyanyi <wolfsonliu@163.com>

fix local path

c662d0e

Signed-off-by: liuyanyi <wolfsonliu@163.com>

DarkLight1337 reviewed Jan 24, 2025

rewrite help

bc8697c

Signed-off-by: liuyanyi <wolfsonliu@163.com>

DarkLight1337 approved these changes Jan 25, 2025

DarkLight1337 enabled auto-merge (squash) January 25, 2025 06:55

github-actions bot added the ready label Jan 25, 2025

Merge branch 'main' into override_generation_config

438c4b2

auto-merge was automatically disabled January 27, 2025 06:50
Head branch was pushed to by a user without write access

DarkLight1337 enabled auto-merge (squash) January 27, 2025 06:52

simon-mo disabled auto-merge January 29, 2025 09:40

simon-mo merged commit ff7424f into vllm-project:main Jan 29, 2025
42 of 49 checks passed

