Make request payload size configurable #2444
Can you add a test in sglang/rust/py_test/test_launch_server.py (line 175 in 993956c: def test_2_add_and_remove_worker(self):)?
Ag, added some tests.
Nice bug catch!
@MrAta can you release a new version after this is merged?
Signed-off-by: Ata Fatahi <immrata@gmail.com>
Motivation
For scenarios with long context lengths (e.g. 32k) and multiple prompts in a single completion request, the size of the HTTP request payload exceeds the default 2 MB limit on the Rust JSON handling, so the request fails before it is sent to the engine.
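As a rough illustration of why such requests hit the limit, here is a minimal sketch in Rust. It is not the router's actual code: only the max_request_payload_size field name comes from this PR, while the Config layout shown here, the helper functions, and the figure of about 4 bytes of JSON text per token are assumptions made for the example.

```rust
/// Hypothetical configuration struct. Only the field name
/// `max_request_payload_size` is taken from this PR; the rest is illustrative.
#[derive(Debug, Clone)]
pub struct Config {
    /// Maximum accepted request body size, in bytes.
    pub max_request_payload_size: usize,
}

impl Default for Config {
    fn default() -> Self {
        // The "2 MB" default from the PR description,
        // assumed here to mean 2 * 1024 * 1024 bytes.
        Config {
            max_request_payload_size: 2 * 1024 * 1024,
        }
    }
}

/// Hypothetical helper: rough payload size for a batch of prompts.
/// Assumes about 4 bytes of JSON text per token, which is only a ballpark.
pub fn estimate_payload_bytes(num_prompts: usize, tokens_per_prompt: usize) -> usize {
    const ASSUMED_BYTES_PER_TOKEN: usize = 4;
    num_prompts * tokens_per_prompt * ASSUMED_BYTES_PER_TOKEN
}

/// Hypothetical helper: returns Ok(()) when the body fits within the
/// configured limit, and an error message otherwise.
pub fn check_payload_size(config: &Config, body: &[u8]) -> Result<(), String> {
    if body.len() > config.max_request_payload_size {
        Err(format!(
            "payload of {} bytes exceeds limit of {} bytes",
            body.len(),
            config.max_request_payload_size
        ))
    } else {
        Ok(())
    }
}
```

Under these assumptions, 32 prompts of 32k tokens each come to roughly 4 MiB, so check_payload_size rejects the body with the default Config and accepts it once max_request_payload_size is raised to, say, 8 MiB.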
Modifications
This PR makes the request payload size configurable by adding a new max_request_payload_size field to the Config struct.
Checklist