@MengqingCao
Collaborator

see #60

@MengqingCao
Collaborator Author

cc @Yikun @wangxiyuan

|---------|-----------|------|
| Chunked Prefill || Plan in 2025 Q1 |
| Automatic Prefix Caching || Improve performance in 2025 Q1 |
| Automatic Prefix Caching || Improve performance in 2025 Q2 (Not supported in release version) |
Collaborator

On the main branch, we should only add docs for main.

Collaborator Author

Thanks! Done.

| Best of |||
| Beam search |||
| Guided Decoding || Plan in 2025 Q1 |
| Tensor Parallel || Only "mp" in main ("ray" and "mp" in release version) |
Collaborator

same

Collaborator Author

done

Signed-off-by: MengqingCao <cmq0113@163.com>
@wangxiyuan wangxiyuan merged commit c935b70 into vllm-project:main Feb 17, 2025
3 checks passed
@MengqingCao MengqingCao deleted the fix_patch branch February 25, 2025 08:45
ttanzhiqiang pushed a commit to ttanzhiqiang/vllm-ascend that referenced this pull request Apr 27, 2025
Check and update the feature support table.

- Both multi-step and speculative decoding require adaptation of the corresponding workers
- The prompt adapter (a fine-tuning method) requires adaptation in worker.py and model_runner.py

Signed-off-by: MengqingCao <cmq0113@163.com>
Skywalker-EP pushed a commit to Skywalker-EP/vllm-ascend that referenced this pull request Jul 24, 2025
3 participants