Possible invalid request formatting for max_completion_tokens
#210
Replies: 1 comment
-
Tl;dr: you can ignore that error. We set both `max_tokens` and `max_completion_tokens` on each request. The standard for setting a max-tokens limit is kind of messy: some model servers support one field, some the other, and vLLM started with `max_tokens`.
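For illustration, here is a minimal sketch of the kind of payload this produces, with both token-limit fields set. The model name and values are placeholders, not taken from guidellm's actual code:

```python
# Sketch of a chat-completions payload that sets both token-limit fields,
# since different OpenAI-compatible servers honor different ones.
# Model name and values are placeholders.
payload = {
    "model": "placeholder-model",
    "messages": [{"role": "user", "content": "Hello"}],
    "max_tokens": 128,             # the older field; vLLM started with this
    "max_completion_tokens": 128,  # the newer OpenAI-style field
}
```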
-
Just looking into this, but wanted to report it in case it was a known issue or someone has more information.
While running guidellm against a locally running `vllm serve`, I am seeing a very large number of these log messages (the "Possible invalid request formatting for max_completion_tokens" message quoted in the title) in the vLLM output. Running a request manually against the endpoint succeeds, with no errors in the vLLM logs.
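By "manually", I mean something along these lines; this is a hedged sketch, and the port (vLLM's default 8000), model name, and prompt are placeholders rather than my exact command:

```python
# Sketch of a manual request against a local vLLM OpenAI-compatible endpoint.
# Requires the third-party "requests" package; port and model are placeholders.
import requests

resp = requests.post(
    "http://localhost:8000/v1/chat/completions",
    json={
        "model": "placeholder-model",
        "messages": [{"role": "user", "content": "Say hello"}],
        # Passed as a top-level property, as the chat-completions API expects:
        "max_completion_tokens": 32,
    },
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```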
That leads me to believe the request payload formed by guidellm must be placing `max_completion_tokens` somewhere other than as a top-level property of the request struct.
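One way to check where the field actually ends up would be to point guidellm at a tiny echo server and dump the raw request body. This is a debugging sketch, not part of guidellm; the stub response is just enough to return 200 (guidellm will likely reject it), and the port is arbitrary:

```python
# Debugging sketch: a tiny HTTP server that prints each POSTed JSON body,
# so you can point guidellm at it and see exactly where
# max_completion_tokens lands in the payload.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class EchoHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        print(json.dumps(json.loads(body), indent=2))  # inspect the payload
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(b'{"choices": []}')  # minimal stub response

HTTPServer(("localhost", 9999), EchoHandler).serve_forever()
```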