feat: support guided decoding for vllm async engine #2391
Conversation
Which version is required?
The latest version after 0.6.2; we are waiting for vllm to release a new version.
Force-pushed from 2968700 to cd0812a
vllm has released v0.6.3. Is this PR ready to work?
I will run the test.
Force-pushed from 4d9e044 to 852c86c
Works on my machine now.
Can you confirm there is no exception if vllm is an old version?
Force-pushed from 852c86c to 894352e
Signed-off-by: wxiwnd <wxiwnd@outlook.com>
Force-pushed from 894352e to 820c726
Signed-off-by: wxiwnd <wxiwnd@outlook.com>
Force-pushed from 823887f to df849b1
It now works properly even if the vllm version is < 0.6.3.
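A minimal sketch of how such a guard could look, assuming a runtime version check; the helper names here are illustrative, not the PR's actual code:

```python
# Sketch only: gate guided decoding on the installed vllm version so that
# older releases keep working, failing only when guidance is requested.
from packaging import version

import vllm

VLLM_GUIDED_DECODING_MIN = version.parse("0.6.3")


def supports_guided_decoding() -> bool:
    return version.parse(vllm.__version__) >= VLLM_GUIDED_DECODING_MIN


def validate_guided_kwargs(guided_json=None, guided_regex=None) -> None:
    # Reject guided-decoding options only when they are actually requested,
    # so plain generation still works on older vllm versions.
    if (guided_json is not None or guided_regex is not None) and not supports_guided_decoding():
        raise ValueError(
            f"Guided decoding requires vllm >= 0.6.3, found {vllm.__version__}"
        )
```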
# FIXME schema replica error in Pydantic
# source: ResponseFormatJSONSchema in ResponseFormat
# use alias
# _response_format: Optional[ResponseFormat] = Field(alias="response_format")
Is there any solution to this?
It appears to be the same issue as Xinference Issue #2032, and I have not yet found a solution.
I have developed a parameter parser for the RESTful API component to ensure that the functionality remains intact. Therefore, these parts can be safely ignored, even though parsing requests in the RESTful API part is somewhat "dirty".
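A hedged sketch of that idea (the field names and helper are assumptions, not the PR's actual code): read the guided-decoding options straight from the raw JSON body instead of routing them through the pydantic model that triggers the shadowing error.

```python
from typing import Any, Dict

# vllm-style guided-decoding options that might appear in a request body.
GUIDED_KEYS = ("guided_json", "guided_regex", "guided_choice", "guided_grammar")


def parse_guided_params(raw_body: Dict[str, Any]) -> Dict[str, Any]:
    """Pull guided-decoding fields out of the raw request dict, if present."""
    return {key: raw_body[key] for key in GUIDED_KEYS if key in raw_body}
```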
IIRC, we solved it by copying the OpenAI pydantic model into xinference; refer to https://github.com/xorbitsai/inference/pull/2231/files
The error is: `Field name "schema" shadows a BaseModel attribute; use a different field name with "alias='schema'"`.
I found that https://github.com/openai/openai-python/blob/main/src/openai/types/shared_params/response_format_json_schema.py#L11 may be the source of the conflict, because another declaration of json_schema, https://github.com/openai/openai-python/blob/main/src/openai/types/shared/response_format_json_schema.py#L27, uses `schema_` instead of `schema`.
I will try to fix this by defining ResponseFormat myself.
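A minimal sketch of such a self-defined model, assuming Pydantic v2 (on v1 the config key would be `allow_population_by_field_name`); the class names are illustrative:

```python
from typing import Any, Dict, Literal, Optional

from pydantic import BaseModel, ConfigDict, Field


class JSONSchemaSpec(BaseModel):
    """Local stand-in for openai's ResponseFormatJSONSchema payload."""

    # Accept construction by field name as well as by alias.
    model_config = ConfigDict(populate_by_name=True)

    name: str
    description: Optional[str] = None
    # "schema" would shadow a BaseModel attribute, so store it as schema_
    # and expose it under the wire name via an alias.
    schema_: Optional[Dict[str, Any]] = Field(default=None, alias="schema")
    strict: Optional[bool] = None


class ResponseFormat(BaseModel):
    type: Literal["text", "json_object", "json_schema"]
    json_schema: Optional[JSONSchemaSpec] = None
```

With `populate_by_name` enabled, validation accepts the wire-format key `"schema"` while the Python attribute stays `schema_`, which is the same trick the `shared/response_format_json_schema.py` declaration uses.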
Support Guided Decoding for vllm async engine
Waiting for a vllm release; a version bump is needed.
#1562
vllm-project/vllm#8252
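For context, a hedged usage sketch of what the feature would look like from a client, sending vllm-style guided-decoding parameters through an OpenAI-compatible endpoint; the URL, model name, and `extra_body` keys are assumptions for illustration, not confirmed by this PR:

```python
import openai

client = openai.OpenAI(base_url="http://localhost:9997/v1", api_key="not-used")

resp = client.chat.completions.create(
    model="my-vllm-model",
    messages=[{"role": "user", "content": "Return a JSON person object."}],
    extra_body={
        # Constrain the output to match this JSON schema (vllm guided_json).
        "guided_json": {
            "type": "object",
            "properties": {"name": {"type": "string"}, "age": {"type": "integer"}},
            "required": ["name", "age"],
        }
    },
)
print(resp.choices[0].message.content)
```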