[Renderer] Move Processor out of AsyncLLM #24138

KKSK-DON · 2025-09-03T02:02:13Z

This PR implements points 1 and 3 of #23869.

Purpose

This PR moves Processor to the API server so that the next step, #23873, can benefit from it.

Test Plan

unit test
I think this is already covered.

manual test

python -m vllm.entrypoints.openai.api_server \
  --model meta-llama/Llama-3.2-3B-Instruct \
  --host 0.0.0.0 --port 8000

curl -s -X POST http://127.0.0.1:8000/v1/chat/completions \
  -H "Authorization: Bearer dummy" \
  -H "Content-Type: application/json" \
  -d '{
    "model":"meta-llama/Llama-3.2-3B-Instruct",
    "messages":[{"role":"user","content":"Give me two short facts about GPU memory."}],
    "max_tokens":64
  }'
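For anyone who prefers to script the manual check, here is a minimal Python sketch of the same request using only the standard library. The helper names `build_chat_request` and `send` are made up for illustration and are not part of vLLM; the URL, headers, and payload mirror the curl command above.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8000"  # same host/port as the server command above
MODEL = "meta-llama/Llama-3.2-3B-Instruct"


def build_chat_request(prompt: str, max_tokens: int = 64) -> urllib.request.Request:
    """Build (but do not send) the chat-completions request from the manual test."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": "Bearer dummy",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON body; needs the server to be running."""
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server up, `send(build_chat_request("Give me two short facts about GPU memory."))["choices"][0]["message"]["content"]` should return the generated text.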

Test Result

Essential Elements of an Effective PR Description Checklist

The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
The test plan, such as providing test command.
The test results, such as pasting the results comparison before and after, or e2e results
(Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
(Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

ywang96

Thanks for the contribution! By moving Processor out of AsyncLLM, I meant that the following call should by default happen at the API server layer or under the LLM class:

vllm/vllm/v1/engine/async_llm.py

Lines 274 to 278 in cb55ad8

    
    # Convert Input --> Request.
    prompt_str, request = self.processor.process_inputs(
        request_id, prompt, params, arrival_time, lora_request,
        tokenization_kwargs, trace_headers, priority, data_parallel_rank)

This means we will need to change the input type of add_request to support both EngineCoreRequest and PromptType. The change should look something like the following for AsyncLLM.add_request:

async def add_request(
    self,
    request_id: str,
    request: Union[EngineCoreRequest, PromptType], # passed from upper layer
    prompt_str: Optional[str],  # passed from upper layer
    params: Union[SamplingParams, PoolingParams],
    arrival_time: Optional[float] = None,
    lora_request: Optional[LoRARequest] = None,
    tokenization_kwargs: Optional[dict[str, Any]] = None,
    trace_headers: Optional[Mapping[str, str]] = None,
    priority: int = 0,
    data_parallel_rank: Optional[int] = None,
) -> RequestOutputCollector:
    ...
    # PromptType is a typing Union, not a class, so check for the concrete request type.
    if not isinstance(request, EngineCoreRequest):
        assert prompt_str is None
        logger.warning_once(...) # deprecation warning
        prompt_str, request = self.processor.process_inputs(...)
    ...

Let me know if you have any questions!
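To make the suggested dispatch concrete, here is a small self-contained sketch of the pattern: accept either a pre-processed request or a raw prompt, and fall back to in-engine processing with a one-time deprecation warning. `FakeEngineCoreRequest`, `FakeProcessor`, and `FakeEngine` are hypothetical stand-ins, not vLLM classes, and the real `add_request` is async and takes more parameters.

```python
import logging
from dataclasses import dataclass
from typing import Optional, Union

logger = logging.getLogger(__name__)


@dataclass
class FakeEngineCoreRequest:
    """Hypothetical stand-in for vLLM's EngineCoreRequest."""
    request_id: str
    prompt_token_ids: list[int]


class FakeProcessor:
    """Hypothetical stand-in for the Processor; 'tokenizes' by code point."""

    def process_inputs(
        self, request_id: str, prompt: str
    ) -> tuple[str, FakeEngineCoreRequest]:
        token_ids = [ord(ch) for ch in prompt]
        return prompt, FakeEngineCoreRequest(request_id, token_ids)


class FakeEngine:
    """Hypothetical engine that accepts both input kinds."""

    def __init__(self) -> None:
        self.processor = FakeProcessor()
        self._warned = False

    def add_request(
        self,
        request_id: str,
        request: Union[FakeEngineCoreRequest, str],
        prompt_str: Optional[str] = None,
    ) -> FakeEngineCoreRequest:
        if not isinstance(request, FakeEngineCoreRequest):
            # Legacy path: the caller passed a raw prompt, so process it here.
            assert prompt_str is None
            if not self._warned:
                logger.warning("Passing raw prompts to add_request is deprecated.")
                self._warned = True
            prompt_str, request = self.processor.process_inputs(request_id, request)
        # Preferred path: the upper layer already produced the request.
        return request
```

In this sketch the upper layer would call `FakeProcessor().process_inputs(...)` itself and hand the resulting request to `add_request`, so the engine only processes inputs for callers that have not migrated yet.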

mergify · 2025-09-06T00:31:43Z

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @KKSK-DON.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

mergify · 2025-09-06T19:26:33Z

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @KKSK-DON.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

mergify · 2025-10-03T07:28:33Z

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @KKSK-DON.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

DarkLight1337 · 2025-10-03T07:28:55Z

Let's wait for the entrypoints test results, then update the PR according to #26097 (note the updated return type of self.processor.process_inputs)

DarkLight1337 · 2025-10-03T08:11:05Z

I have updated it for you.

mergify bot added frontend v1 labels Sep 3, 2025

KKSK-DON changed the title ~~1. add Processor to API server 2. add deprecate warning into AsyncLLM~~ [Refactor] Move Processor out of AsyncLLM Sep 3, 2025

KKSK-DON marked this pull request as ready for review September 3, 2025 23:39

KKSK-DON requested review from WoosukKwon, aarnphm, alexm-redhat, comaniac, njhill, robertgshaw2-redhat and ywang96 as code owners September 3, 2025 23:39

KKSK-DON changed the title ~~[Refactor] Move Processor out of AsyncLLM~~ [Renderer] Move Processor out of AsyncLLM Sep 4, 2025

ywang96 reviewed Sep 4, 2025

KKSK-DON force-pushed the refactor/move-processor branch from 65bd313 to 491cf3a Compare September 4, 2025 23:43

KKSK-DON changed the title ~~[Renderer] Move Processor out of AsyncLLM~~ WIP [Renderer] Move Processor out of AsyncLLM Sep 5, 2025

mergify bot added the needs-rebase label Sep 6, 2025

KKSK-DON force-pushed the refactor/move-processor branch from 14a402c to 94394af Compare September 6, 2025 00:39

mergify bot removed the needs-rebase label Sep 6, 2025

mergify bot added the needs-rebase label Sep 6, 2025

KKSK-DON force-pushed the refactor/move-processor branch from 360ec5f to 34c75c7 Compare September 6, 2025 19:30

mergify bot removed the needs-rebase label Sep 6, 2025

KKSK-DON requested review from DarkLight1337 and simon-mo as code owners September 7, 2025 00:22

KKSK-DON commented Sep 7, 2025

tests/entrypoints/openai/test_async_tokenization.py

KKSK-DON force-pushed the refactor/move-processor branch from 72358ec to f786d14 Compare September 8, 2025 17:36

KKSK-DON changed the title ~~WIP [Renderer] Move Processor out of AsyncLLM~~ [Renderer] Move Processor out of AsyncLLM Sep 8, 2025

KKSK-DON changed the title ~~[Renderer] Move Processor out of AsyncLLM~~ WIP[Renderer] Move Processor out of AsyncLLM Sep 8, 2025

KKSK-DON requested a review from ywang96 September 9, 2025 03:15

KKSK-DON added 2 commits October 2, 2025 09:51

refactor: replace _initialize_processor with _get_processor for better processor management

2e6a876

Signed-off-by: Yang <lymailforjob@gmail.com>

fix: use self.model_config instead of vllm_config for tokenizer initialization

6a1fddb

Signed-off-by: Yang <lymailforjob@gmail.com>

auto-merge was automatically disabled October 2, 2025 16:51
Head branch was pushed to by a user without write access

KKSK-DON force-pushed the refactor/move-processor branch from 8003baf to 6a1fddb Compare October 2, 2025 16:51

DarkLight1337 mentioned this pull request Oct 2, 2025

[Input] Remove unused prompt field #26097

Merged

5 tasks

fix

71f76b6

Signed-off-by: Yang <lymailforjob@gmail.com>

KKSK-DON requested a review from NickLucche as a code owner October 3, 2025 02:35

fix

ca98b71

Signed-off-by: Yang <lymailforjob@gmail.com>

mergify bot added the needs-rebase label Oct 3, 2025

Merge branch 'main' into refactor/move-processor

f7de98d

Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>

DarkLight1337 enabled auto-merge (squash) October 3, 2025 08:11

mergify bot removed the needs-rebase label Oct 3, 2025

DarkLight1337 added 4 commits October 3, 2025 08:12

Same order

35425d7

Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>

Fix arg name

d623d3c

Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>

Fix tests

e6d91b3

Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>

Fix

828f4f6

Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>

DarkLight1337 merged commit 812b7f5 into vllm-project:main Oct 3, 2025
51 checks passed

DarkLight1337 mentioned this pull request Oct 3, 2025

[Renderer] Move Processor out of LLMEngine #26165

Merged

5 tasks

4 participants
