[Renderer] Move Processor out of AsyncLLM #24138
Thanks for the contribution! By moving Processor out of AsyncLLM, I meant that the following call should by default happen at the API server layer or under the LLM class:
vllm/vllm/v1/engine/async_llm.py
Lines 274 to 278 in cb55ad8
    # Convert Input --> Request.
    prompt_str, request = self.processor.process_inputs(
        request_id, prompt, params, arrival_time, lora_request,
        tokenization_kwargs, trace_headers, priority, data_parallel_rank)
This means we will need to change the input type of add_request to support both EngineCoreRequest and PromptType. The change to AsyncLLM.add_request should look something like the following:
    async def add_request(
        self,
        request_id: str,
        request: Union[EngineCoreRequest, PromptType],  # passed from upper layer
        prompt_str: Optional[str],  # passed from upper layer
        params: Union[SamplingParams, PoolingParams],
        arrival_time: Optional[float] = None,
        lora_request: Optional[LoRARequest] = None,
        tokenization_kwargs: Optional[dict[str, Any]] = None,
        trace_headers: Optional[Mapping[str, str]] = None,
        priority: int = 0,
        data_parallel_rank: Optional[int] = None,
    ) -> RequestOutputCollector:
        ...
        if isinstance(request, PromptType):
            assert prompt_str is None
            logger.warning_once(...)  # deprecation warning
            prompt_str, request = self.processor.process_inputs(...)
        ...

Let me know if you have any questions!
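To make the proposed dispatch concrete, here is a minimal runnable sketch. Everything in it is a toy stand-in: the `EngineCoreRequest`, `Processor`, and `AsyncLLM` classes below only mimic the shape of the real vLLM classes, the `PromptType` alias is narrowed to `str` or `dict`, and the character-code "tokenizer" is purely illustrative. One assumption worth flagging: the sketch tests for the concrete `EngineCoreRequest` class rather than calling `isinstance` against the `PromptType` alias, since a `Union` alias that includes `TypedDict` members may not be usable with `isinstance`.

```python
import asyncio
import warnings
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class EngineCoreRequest:
    """Toy stand-in for vLLM's EngineCoreRequest (hypothetical fields)."""
    request_id: str
    prompt_token_ids: list[int]


# Toy stand-in: the real PromptType alias covers more prompt shapes.
PromptType = Union[str, dict]


class Processor:
    """Toy stand-in for the real Processor."""

    def process_inputs(self, request_id: str, prompt: PromptType):
        text = prompt if isinstance(prompt, str) else prompt.get("prompt", "")
        token_ids = [ord(c) for c in text]  # illustrative "tokenizer"
        return text, EngineCoreRequest(request_id, token_ids)


class AsyncLLM:
    """Toy engine that accepts either a processed request or a raw prompt."""

    def __init__(self) -> None:
        self.processor = Processor()
        self.queued: list[tuple[Optional[str], EngineCoreRequest]] = []

    async def add_request(
        self,
        request_id: str,
        request: Union[EngineCoreRequest, PromptType],
        prompt_str: Optional[str] = None,
    ) -> EngineCoreRequest:
        if not isinstance(request, EngineCoreRequest):
            # Legacy path: the caller handed over a raw prompt.
            assert prompt_str is None
            warnings.warn(
                "Passing a raw prompt to add_request is deprecated; "
                "process it in the upper layer instead.",
                DeprecationWarning,
                stacklevel=2,
            )
            prompt_str, request = self.processor.process_inputs(
                request_id, request)
        self.queued.append((prompt_str, request))
        return request
```

With this shape, callers that already hold a processed request skip the processor entirely, while older callers keep working and see a deprecation warning.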
…r processor management Signed-off-by: Yang <lymailforjob@gmail.com>
…alization Signed-off-by: Yang <lymailforjob@gmail.com>
Let's wait for the entrypoints test results, then update the PR according to #26097 (note the updated return type of
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
I have helped you update it.
Signed-off-by: Yang <lymailforjob@gmail.com> Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk> Signed-off-by: yewentao256 <zhyanwentao@126.com>
Signed-off-by: Yang <lymailforjob@gmail.com> Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk> Signed-off-by: Tomer Asida <57313761+tomeras91@users.noreply.github.com>
Signed-off-by: Yang <lymailforjob@gmail.com> Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk> Signed-off-by: Karan Goel <3261985+karan@users.noreply.github.com>
Signed-off-by: Yang <lymailforjob@gmail.com> Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: Yang <lymailforjob@gmail.com> Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk> Signed-off-by: xuebwang-amd <xuebwang@amd.com>
This PR implements points 1 and 3 of #23869.
Purpose
This PR moves Processor to the API server, so that the next step, #23873, can benefit from it.
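As a rough illustration of the resulting call path, the sketch below has the serving layer own the processor and hand the engine an already-processed request. All names here (`ServingLayer`, `Engine`, `handle`, and the toy `Processor` and `EngineCoreRequest`) are hypothetical stand-ins and not the actual vLLM classes or signatures.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class EngineCoreRequest:
    """Toy stand-in for the processed request type."""
    request_id: str
    prompt_token_ids: list[int]


class Processor:
    """Toy stand-in: turns a prompt string into a processed request."""

    def process_inputs(self, request_id: str, prompt: str):
        token_ids = [ord(c) for c in prompt]  # illustrative "tokenizer"
        return prompt, EngineCoreRequest(request_id, token_ids)


class Engine:
    """Toy engine that only sees already-processed requests."""

    def __init__(self) -> None:
        self.received: list[EngineCoreRequest] = []

    async def add_request(self, request_id, request, prompt_str):
        self.received.append(request)
        return f"queued {request_id} ({len(request.prompt_token_ids)} tokens)"


class ServingLayer:
    """Hypothetical API-server layer that owns input processing."""

    def __init__(self, engine: Engine, processor: Processor) -> None:
        self.engine = engine
        self.processor = processor

    async def handle(self, request_id: str, prompt: str) -> str:
        # Input processing happens here, before the engine is involved.
        prompt_str, request = self.processor.process_inputs(
            request_id, prompt)
        return await self.engine.add_request(request_id, request, prompt_str)
```

Usage: `asyncio.run(ServingLayer(Engine(), Processor()).handle("r1", "abc"))` returns a short status string from the toy engine.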
Test Plan
unit test
I think this is already covered by the existing tests.
manual test
Test Result